Category: Genetic Algorithms

  • From Scratch to Streamlined: Comparing My Hand-Built Genetic Algorithm with sklearn-genetic

    After building a genetic algorithm from scratch in Jupyter, I wanted to see what would happen if I used a library instead. Specifically, I tried out sklearn-genetic, a tool that wraps genetic feature selection into a few clean lines of code.

    The difference is incredible. My original notebook was over one hundred lines of code. With sklearn-genetic, the same process became a single call:

    from sklearn.tree import DecisionTreeClassifier
    from sklearn_genetic import GAFeatureSelectionCV

    selector = GAFeatureSelectionCV(
        estimator=DecisionTreeClassifier(random_state=RANDOM_STATE),
        cv=CROSS_VALIDATION_SPLITTING,
        scoring=SCORING_STRATEGY,
        population_size=POPULATION_SIZE,
        generations=NUM_GENERATIONS,
        mutation_probability=MUTATION_RATE,
    )
    selector.fit(X, y)

    It worked beautifully. But it’s worth thinking about what I gained and what I lost with each approach.

    What the Library Does Well

    • Speed of implementation: No need to write selection, crossover, or mutation logic. It’s all built in.
    • Robustness: It easily handles edge cases, parallelism, and scoring strategies.
    • Integration: Fits seamlessly into scikit-learn pipelines and workflows.
    • Convenience: You can run a full GA in minutes, with clean syntax and very little code.

    The library certainly has some big advantages, as a good library should! 😀

    What I Missed from Building It Myself

    • Visibility: In my notebook, I saw every generation evolve. With the library, that process is hidden.
    • Control: I had access to the state of the system at all times, so I could change parameters or visualize data in the middle of a run.
    • Learning: Writing the GA by hand taught me how each operator affects convergence, diversity, and exploration.
    • Philosophy: My notebook felt like a real experiment. The library felt like a tool.

    The approaches serve different purposes. But if your goal is to actually learn genetic algorithms, building one yourself is irreplaceable.

    Side-by-Side Summary

    | Aspect         | Hand-Built GA                         | sklearn-genetic                  |
    | -------------- | ------------------------------------- | -------------------------------- |
    | Transparency   | Full control over internals           | Abstracted                       |
    | Flexibility    | Easy to customize logic               | Limited to API                   |
    | Speed          | Slower to build, faster to understand | Faster to run, harder to inspect |
    | Learning Value | High                                  | Moderate                         |

    Final Thoughts

    Using sklearn-genetic felt like using any library: you hand off control. It’s efficient, clean, and powerful. But building the algorithm myself taught me how the engine works, how selection pressure shapes populations, how mutation keeps diversity alive, and how exploration leads to clarity.

    If you’re just trying to get results, use the library.
    If you’re trying to understand the process, build it yourself.
    And if you’re trying to do both — start with the notebook, then graduate to the tool.

    – William

    My notebook can be found in my GitHub repo here:

    Genetic Algorithm Notebook

  • When Code Evolves: Learning Genetic Algorithms Through a Simple Notebook

    There’s something cool about watching a solution evolve. In this case, it was a population of digital organisms competing, mutating, and adapting until a solution emerged. That’s what genetic algorithms do: they take a problem that may feel too tangled to reason about directly, then explore and optimize their way to a final solution.

    After reading about genetic algorithms, I wanted to understand them more deeply, not just in theory but also in practice. So I opened a Jupyter notebook, loaded a simple dataset, and built a genetic algorithm from scratch. No libraries, no shortcuts. Just Python, NumPy, and a willingness to let evolution take over. Just like my first experimentation with Naive Bayes Classifiers.

    I chose a simple dataset on Kaggle that contains heart disease data. I chose this dataset because it isn’t too large but has a decent set of features to use for optimization.

    A Simple Idea: Evolving Feature Sets

    The experiment was straightforward: Could a genetic algorithm discover the best subset of features for predicting heart disease?

    Each potential solution was represented as a row of 0s and 1s that indicate which features to keep and which to remove. So for example, a row might look like this:

    [1, 0, 1, 1, 0, 0, 1]

    That means “only use features 1, 3, 4, and 7.”

    It’s a really simple encoding. In biological terms: each 1 or 0 is a gene, each list of 1s and 0s is a genome, and each generation is a chance for something better to emerge.
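    To make that concrete, here’s a tiny sketch (with a toy matrix, not the actual heart disease data) of how such a genome masks the feature columns with NumPy:

```python
import numpy as np

# A genome is a bit mask over the dataset's columns.
genome = np.array([1, 0, 1, 1, 0, 0, 1])

# Toy feature matrix standing in for the real data: 5 rows, 7 features.
X = np.arange(35).reshape(5, 7)

# Keep only the columns whose gene is 1 (features 1, 3, 4, and 7).
X_subset = X[:, genome == 1]
print(X_subset.shape)  # (5, 4)
```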

    How the Algorithm Works (In Human Terms)

    The notebook follows a classic evolutionary loop:

    1. We start with a population of random individuals, each encoding a subset of the dataset’s features.

    Many of them may be terrible, but that’s okay. Evolution doesn’t need a good starting point, just variation.
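    A minimal initialization might look like this; the function and variable names are my own, not necessarily the notebook’s:

```python
import random

def init_population(pop_size, n_features, rng=random):
    """Create pop_size random bit masks; quality doesn't matter yet, variation does."""
    return [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]

population = init_population(20, 7)
```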

    2. Evaluate each individual

    For every set of features, we train a small decision tree using only those features. The accuracy of the tree becomes the “fitness score.”
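    As a sketch of that scoring step, assuming a simple train/test split (the notebook’s exact evaluation may differ):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def fitness(genome, X, y):
    """Score a genome: train a small decision tree on the selected features only."""
    genome = np.asarray(genome)
    if genome.sum() == 0:  # an individual that keeps no features can't be scored
        return 0.0
    X_train, X_test, y_train, y_test = train_test_split(
        X[:, genome == 1], y, test_size=0.3, random_state=0
    )
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X_train, y_train)
    return accuracy_score(y_test, tree.predict(X_test))
```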

    3. Select parents

    We use tournament selection: pick two individuals at random, keep the better one. It’s very simple, but it pushes the population toward improvement.
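    In code, tournament selection can be as small as this (a sketch with my own names, not necessarily the notebook’s):

```python
import random

def tournament_select(population, fitnesses, rng=random):
    """Pick two individuals at random and return the fitter of the pair."""
    i = rng.randrange(len(population))
    j = rng.randrange(len(population))
    return population[i] if fitnesses[i] >= fitnesses[j] else population[j]
```

    Because the better of each random pair survives, fitter individuals are chosen more often, and that mild bias is all the selection pressure the loop needs.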

    4. Crossover

    Two parents randomly combine their “genes” to create a child. Some genes from one, the rest from the other. This is where new combinations emerge.
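    A common way to implement this is single-point crossover, sketched here (the notebook’s exact operator may differ):

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover: take a prefix of one parent and the suffix of the other."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the genome
    return parent_a[:point] + parent_b[point:]
```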

    5. Mutation

    Every 1 or 0 has a small chance of flipping. This is mutation: the spark of creativity that keeps evolution from getting stuck.
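    A bit-flip mutation sketch (the 5% rate here is just an illustrative default, not the notebook’s setting):

```python
import random

def mutate(genome, rate=0.05, rng=random):
    """Flip each gene independently with a small probability."""
    return [1 - gene if rng.random() < rate else gene for gene in genome]
```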

    6. Repeat for many generations

    And watch the accuracy climb. The notebook prints out the best accuracy of each generation, like this:

    Generation 1: Best Accuracy = 0.7692
    Generation 2: Best Accuracy = 0.8022
    Generation 3: Best Accuracy = 0.8352
    Generation 4: Best Accuracy = 0.8242
    Generation 5: Best Accuracy = 0.8352

    It’s like watching a species learn.

    What I Learned by Building It Myself

    The most fun part of this project wasn’t the accuracy score or the final feature set. It was what I learned by writing the code myself. When you don’t rely on a library or a prebuilt GA tool, you’re forced to think through the problem directly. You get a feel for the algorithm.

    That helps make it all click.

    Just as I read, genetic algorithms don’t assume the world is smooth or predictable. They don’t need gradients or clean math. They don’t freeze when the search space is really messy. They just explore, adapt, and keep going and going. Watching that happen in code, watching the population of feature selections slowly learn which features matter more, made the philosophy behind GAs feel real in a way that reading about them or using a library never would.

    It showed me that in complex systems, you don’t get to reason your way to the perfect solution upfront. You need to start wide, stay curious and let patterns emerge before you decide what matters. Writing the notebook by hand was a lesson in how exploration leads to clarity.

    Why This Notebook Is a Great Playground

    Because it’s small, clear, and easy to modify. You can:

    • swap in a different model
    • evolve hyperparameters instead of features
    • visualize fitness over time
    • and a lot more

    It’s a simple sandbox for learning how evolutionary computation works.

    When you see a population of solutions improving generation after generation, it’s hard not to appreciate the elegance of genetic algorithms.

    Closing Thoughts

    Genetic algorithms aren’t the hottest technique in machine learning anymore. But they are still pretty cool and were a very important part of the evolution of data science. They show us that exploration is not a waste of time, it’s a strategy. That creativity can be computational. And they prove that sometimes the best solutions emerge from processes we don’t control.

    Building one in a notebook made that lesson tangible. And honestly, it made me appreciate evolution, both biological and computational, in a whole new way.

    – William

    My notebook can be found in my GitHub repo here:

    Genetic Algorithm Notebook

  • Genetic Algorithms

    I owe you an apology. I disappeared for a bit because I was trapped in the ninth circle of the college‑application process, otherwise known as “writing essays.” So many essays. I needed a rest! But I’ve resurfaced, slightly overcaffeinated and definitely overshared, and I’m ready to talk about something far more relaxing than applications: evolutionary computation.

    I stumbled across this branch of data science and found it so cool I started reading more and wanted to share what I found. Evolutionary computation is cool because it feels like the closest thing computer science has to science fiction. It is one of the rare techniques where the machine genuinely surprises you because it’s creative in a way that mirrors nature itself.

    When Algorithms Evolve: The Story of Genetic Algorithms

    There is something so cool about the idea that algorithms can evolve. Not in the metaphorical sense, but in the literal, biological sense, where solutions compete, adapt, and survive the same way living things do. It’s a reminder that not everything has to follow a straight line.

    Genetic algorithms grew out of that spirit of curiosity. They offered a way to explore problems that were too messy or too unpredictable for traditional methods. For a while, they captured the imagination of researchers, engineers, and artists, because they made computation feel creative. Even today, long past their time in the spotlight, they still have a strange and enduring charm.

    What Genetic Algorithms Actually Do

    Genetic algorithms search for solutions by treating them like organisms in a population. Each candidate competes, reproduces, and mutates, and over generations the population evolves to a solution. They explore, stumble, adapt, and occasionally discover solutions no human would have thought to try.

    This makes them especially useful in problems where intuition fails and the search space is complex.

    A Brief History

    Genetic algorithms emerged in the 1960s through the work of John Holland at the University of Michigan. His research asked a completely new question: if evolution can produce complex organisms without a designer, could computation do the same?

    By the 1980s and 1990s, this idea had spread far beyond academia. Engineers at NASA once used a genetic algorithm to design an antenna for a spacecraft. They fed the system a set of constraints, pressed go, and watched as each generation twisted itself into stranger shapes. The final design looked like a bent paperclip someone had stepped on, but it outperformed every hand‑crafted alternative.

    In architecture studios, designers sometimes let genetic algorithms “grow” building facades. The results look like coral reefs or alien cathedrals, and half the fun is seeing what the algorithm thinks is beautiful.

    At that time in computing history, genetic algorithms were more than tools. They were a philosophy. They suggested that creativity could be computational and exploration could be automated.

    Why Genetic Algorithms Were So Beloved

    They could function in messy data. If you had a problem where the objective function was noisy, discontinuous, or simply weird, a genetic algorithm didn’t care.

    They were very flexible. You could encode almost anything, from bits to rules or even a neural network architecture, and evolution would happily go to work. This flexibility made it feel like they could solve any optimization problem.

    They were also intuitive. People already understood evolution. The user didn’t need to understand calculus to understand how genetic algorithms work. That made them an entry point into computational thinking for many students and researchers.

    What I love about genetic algorithms is that they take something that is too complicated to understand all at once, start wide, stay curious, and only narrow in when the evidence earns it. That feels very relevant outside of computation.

    Curiosity on its own doesn’t guarantee efficiency. As the field grew, machine learning shifted toward methods that rewarded precision over exploration.

    Why Genetic Algorithms Faded from the Spotlight

    As machine learning matured, new methods emerged that were faster, more efficient, and more predictable. For example, Bayesian optimization offered a smarter way to search with fewer evaluations.

    In many areas of data science, genetic algorithms couldn’t compete with the speed and precision of newer techniques. Genetic algorithms were powerful generalists in a world that increasingly rewarded specialists. Their strengths did not disappear; they just became less central as the field evolved.

    Where Genetic Algorithms Still Shine

    Today, genetic algorithms still thrive in places where the search space is too messy for calculus and too high-dimensional for brute force.

    In engineering, genetic algorithms are still used to design structures that must balance competing constraints such as strength, weight, cost, and manufacturability. These are problems with no clean analytic path. In robotics, evolutionary strategies help discover control policies that are robust in the face of noise and uncertainty. In creative fields, genetic algorithms still generate art, music, and architectural forms.

    Even in machine learning, they have not disappeared. Genetic algorithms are used for feature selection when the relationships are too tangled. They are used to evolve neural network architectures in ways that other techniques cannot.

    Genetic algorithms survive because they explore. Exploration is still necessary. In messy, unpredictable environments, exploration isn’t a luxury. It’s how systems can stay resilient.

    A Reflection Through the Lens of the Early Signal Project

    The Early Signal Project does not use genetic algorithms directly, but the philosophy behind them resonates deeply with the work.

    Genetic algorithms show us that diversity is not noise. It is information. Populations thrive when they contain many possibilities, not when they are homogenous. Premature convergence is risky because it shuts down other possibilities. Exploration needs to come before optimization. Good solutions emerge through iteration, not assumption.

    Early detection in education requires the same mindset. The goal isn’t to force everything into a single pattern. It’s to let patterns emerge honestly, even if they surprise you.

    You should not rush to conclusions about a student based on a single signal. You need to explore patterns without prejudice. Let insights emerge from the data. Refine carefully, ethically, and with humility. Protect the diversity of student experiences rather than trying to force them into one narrative.

    In that sense, the Early Signal Project is its own evolutionary system. It evolves toward clarity, fairness, and early support for students.

    Closing Thoughts

    Genetic algorithms remind us that progress can often come from embracing variation rather than suppressing it. They show that creativity can be computational, that exploration is as important as optimization, and that surprising solutions often emerge from processes we do not control.

    In data science and in education, solutions evolve when we give room to breathe and adapt.

    Sometimes the most powerful ideas are not the newest ones. They are the ones that remind us how to think and remind us how to stay curious.

    – William

  • Short Post

    Just a quick ‘nothing’ post this time. I’ve been very busy with college applications and the holidays. I’ll post again soon once application season settles down a bit. I stumbled across an interesting article on genetic algorithms. It sounds so interesting that I think my next post may be on that topic!

    – William