Building Blocks: Toys or Tech? – Matthew H Bernard

I decided to write this post after back-tracking several times with other post drafts and realizing each one was really describing a smaller part of a greater process. There are entire textbooks, degrees, and careers dedicated to it, but I haven’t found any that make an analogy to the “central dogma” of molecular biology using everyone’s favourite toy – Lego!

There are tons of good explanations out there and likely better than mine but this is my take on it for the lay person. I’ll do my best, but I also have to ask something of you: if you want to be able to effectively discuss related topics, you have to do your part to try to understand them. I’ve been involved in so many conversations or initiatives that appear to be in good faith, but are stuck at a certain level of outdated understanding and thus are moot. Waste of everyone’s time.

Then again, there are a lot of educators that love the sound of their own voices and aren’t conveying things for the purpose of understanding…

In other words, I think everyone can do a better job at improving dialogue around biotech topics, regardless of one’s stance, because this is where things are at today: technology is never going to stop advancing & evolving, but people are choosing to stop keeping up with it (and in some cases, unaware of their gaps in understanding):

Note that this is going to be an oversimplification and is by no means a complete guide to the natural world. I’m only going to zoom in as deep as the elements, and zoom out as far as the individual oRgaNiSm, but there are “layers” wayyyyyyy beyond these in both directions. This video shows what I mean (you might want to mute the sound, the video is already anxiety-inducing on its own):

If it’s been a while since you played with Lego, or thought about organic chemistry (the branch of chemistry concerned with carbon, and especially the carbon compounds found in living things), don’t worry: I’m going to use visuals and explanations. Actually, it’s only really important to know about 3 or 4 of the elements to understand the context of what I’ll go into here, and even in other posts. In some ways, genetics is more straightforward than Lego! I’ll show you what I mean.

This post will be divided into 3 parts:
1) general chemistry & molecular concepts
2) how those concepts relate to biology (genetics)
3) analogy between Lego & biology

(A few disclaimers: this isn’t a 100% perfect parallel because…well…nothing is perfect. Also, all Lego photos are used with permission, ©2021 The LEGO Group. This blog post is an individual publication, not sponsored, endorsed or connected with the LEGO Group. Lastly, no nerds were harmed in the writing of this post.)

PART 1

As we saw in that first video, one could choose some deep level of quantum physics as a starting point (and I’m sure there are levels beyond that that we can’t comprehend) but for the purposes here, a reasonable arbitrary starting point is the atomic level.

An atom is “…the smallest object that retains the properties of an element…[whose] chemical behavior is determined by the arrangement of its electrons…[and that] may combine with one another by chemical bonding to produce molecules.”

In other words, take a bunch of atoms that are the same and let them join together in a 3-dimensional way, and you end up with a pure substance, or an element.

A nice definition of “element” is: “…a substance whose atoms all have the same number of protons…chemically the simplest substances and hence cannot be broken down using chemical reactions. Elements can only be changed into other elements using nuclear methods.”

In other words, atoms can differ from one another by how many sub-units (protons & electrons) they’re comprised of; when they join (chemically bond) together, they form molecules. (And when the molecules are comprised of atoms that are all identical, they are molecules of an element.)

Here’s a periodic table of the elements, annotating these differences:

And, a strange animated one just because it looks cool:

So then what happens when you have atoms that AREN’T the same, toss them together and mix them up?

Well, they still join together and make random 3D shapes (unfortunately not shaped like baked chicken), but this time atoms that vary from one another (in the number and arrangement of their electrons/protons/neutrons, as mentioned above) might join together differently than they would in an element. These 3D shapes are generally called “molecules.” When illustrating molecules, there is a convention to use certain colours consistently with certain common atoms to make them easily identifiable (carbon is black, hydrogen is white, oxygen is red, nitrogen is blue, phosphorus is orange, and so on). Here are some common molecules you’re familiar with – based on that colour coding, can you guess what they are??

Therefore, “molecular” would generally mean “of or relating to molecules.” (See my post about fatty acids for a few other examples of common molecules that you eat.)

Ok let’s pause for a minute and recap, and make a long story short: atoms are made up of a few basic particles, and differ in number of said particles. Atoms join together in various arrangements in space. Those arrangements dictate 3D structures once they join together, called molecules, each with their own properties; their properties are dictated by the 3D shape, and by their composition. If only identical atoms join, the molecules form elements. These concepts are consistent for everything – you, the screen you’re reading this on, your desk plant, your cat, etc.

See, chemistry isn’t too tricky, right?

PART 2

Now in the context of genetics, there are 4 other molecules I want to focus on; they’re a bit bigger & more involved than the previous 4 examples, but all fall under the general category of “nucleotides” or “base pairs.” (Side note: nucleotides (“nts”) or base pairs (“bps”) can be used as a unit of measurement of length when talking about genes.) Proper names are below; not important to know the differences or full names, but let’s just call them A, T, G, and C to keep it simple:

Check out these interactive versions! You can rotate them around as you wish to see their 3D shapes.

Interesting thing about these ones is that they tend to want to join together in the same way all the time, particularly in pairs (A with T; G with C) shown with the dotted lines (ignore the letters/numbers):

To date, the structures of these 4 molecules and their natural affinities for one another are the same in plants, animals, fungi, bacteria, viruses, and archaea (all “organisms”). They’re the same whether these organisms reproduce via sexual recombination, apomixis, or cloning, and whether they are genetically modified or synthesized artificially. If the structure of a nucleotide changed, this would be referred to as a mutation, which has immediate consequences at a micro level, and often a domino effect at a macro level. They cannot vary. These four molecules are absolutely fundamental in viable biological organisms.
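For readers who like to see rules written down, the pairing behaviour can be sketched in a few lines of Python. This is only an illustration of the "A with T; G with C" rule, not a model of the chemistry, and the names `PAIRS`, `partner` and `is_valid_pairing` are made up for this sketch.

```python
# Pairing rule from the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner(base: str) -> str:
    """Return the letter that naturally pairs with the given one."""
    return PAIRS[base.upper()]


def is_valid_pairing(left: str, right: str) -> bool:
    """Check, position by position, that two rows of letters pair up."""
    if len(left) != len(right):
        return False
    return all(PAIRS.get(a.upper()) == b.upper() for a, b in zip(left, right))


print(partner("A"))                      # T
print(is_valid_pairing("ATGC", "TACG"))  # True
print(is_valid_pairing("ATGC", "TTCG"))  # False (T does not pair with T)
```

The lookup table holds the whole rule: four entries, no exceptions.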

When phosphorus is added in a certain way, those joined pairs want to “stack together” in a perpendicular (relative to their pairing) fashion – a bit tricky to show on a 2D screen, but something like this:

The nucleotides face each other, with the phosphorus on the outer side. Zoom out a bit and it starts to look something like this:

The pairs are still connected laterally, but now the pairs are joining vertically; the phosphorus starts creating a “backbone” twisting around the outside, with nucleotides paired inside. Zoom out further yet, and this is what they look like:

Here’s a 3D illustration that you can rotate/move/zoom to better understand the structure (keep in mind, it’s only a fraction of the total length of these strands that stack in nature).

Now it gets to be a pretty messy diagram, so a simplified version is often drawn like this:

Or this:

Or simply, this:

And thus, the staggered, spiralled stacking of these base pairs forms HUGE super stable molecules that most are familiar with: the double-helix of DNA (in nerd terms, DeoxyriboNucleic Acid). These graphics are zoomed in on only a few dozen base pairs, but DNA molecules can naturally be millions of base pairs in length. Remember earlier, I mentioned base pairs are simplified and represented using 4 letters? Here’s what that looks like in a more practical way: for example, the genome – the correct sequence of ALL base pairs – of thale cress is known and viewable, and here is a random example of the sequence of a gene from the same plant. (NCBI is the public “go-to” database for any published genetic info from any publication – you can actually search pretty much anything there and see sequences, origins, publication references, etc.):

This looks like gibberish, but it’s essential this is understood for so many other concepts that are built from it in modern agriculture, and beyond. That’s why I’m trying to lay the groundwork in detail here, so that even without formal training, anyone can have more effective conversations around such technology. BUT, really it’s not too bad: note this sequence above is comprised of only 4 different letters (and now you know what they represent!).
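If you ever copy a sequence out of a database and want to convince yourself that it really is just 4 letters, a tiny Python check does it. The `snippet` below is a made-up 24-letter string for illustration only (it is not the thale cress gene), and `is_dna` is a hypothetical helper name.

```python
from collections import Counter

# Made-up snippet for illustration only; NOT a real gene sequence.
snippet = "ATGGCTTCAGGATCCATTGACTAA"


def is_dna(sequence: str) -> bool:
    """True if the sequence uses nothing but the letters A, T, G and C."""
    return set(sequence.upper()) <= set("ATGC")


counts = Counter(snippet)  # how many times each letter appears

print(is_dna(snippet))     # True
print(sorted(counts))      # ['A', 'C', 'G', 'T']
print(len(snippet))        # 24
```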

The gene technically isn’t “doing” anything. Rather, there are proteins (just more big complex molecules, made up of mostly the same handful of atoms above, but arranged in different ways) floating around that “read” these sequences. “Reading” genes triggers these proteins to grab other small molecules, like amino acids, and assemble them in yet more new ways to create MORE proteins. The newly-assembled ones have new functions; given that genomes have millions of base pairs (thousands of genes), there’s a plethora of what these functions are! In a few minutes, you’ll see my Lego analogy of this.

PART 3

Now we can start to make analogies between genetic events of biological systems, and the process of building a Lego kit.

Essentially, there is a 3-step process in both biology, and building Lego:

A) instructions are required as the “blueprint” of what is to be built;
B) materials and intermediate mechanisms are required to make sense of the blueprint and help assemble the parts;
C) an end-product with some specific function results as the output.

Now let’s break down each step:

A) Instructions

We’ve already discussed the entire set of genes – the genome – of the plant. Inside the nucleus of each cell, the genome is found scrunched up in and dispersed across groups called “chromosomes” that make it easier to store so much information. The “nucleus” in the Lego example would be a box full of ringed binders (“chromosomes”) of instruction booklets (“genes”) for building several Lego toys (“proteins”): let’s say, a payloader, forklift, excavator, truck, etc. (that all have different but complementary jobs to do):

(not a plant cell but it’s a good illustration for this point)

(As I mentioned at the beginning, we’re jumping into an arbitrary starting point of the cyclical reality of nature; the chemical make up of genes (as described earlier), and of the paper/ink of the Lego instructions (not described), are made of arrangements of molecules listed in the periodic table. So this is our starting point – after the chemistry has already been assembled into “instructions.”)

Let’s just zoom in and focus on one gene encoding for one protein, or the booklet to make just the forklift. Together with you, the people building it, this is the complete package: the materials and instructions needed to put the pieces together in the proper way, in order to create an end-product that does a specific job. The gene, within the chromosome, within the genome inside the nucleus (with molecules floating around) is the cellular equivalent. In the case of Lego, the goal is to create a forklift that lifts things; in the case of the plant cell, it’s to create a protein that does something:

This isn’t quite accurate…many different proteins with different functions are involved in cellular processes even though the instructions are tied up specifically inside the nucleus; some proteins are inside the nucleus where the genome is, some are outside the nucleus with other things going on, but all within the cell. The Lego kit is the “nucleus,” and people are the “proteins” that have their own tasks to carry out, and the living room is the “cell.” Let’s assume you’re building the kit with your kids that can’t read yet, so this is actually a more realistic overview:

Now let’s zoom in again. Each sub-unit (the gene in the cell, or the booklet amongst the many others in the ring binder) of the complete set of instructions has bits of info throughout that need to be read by a third party and assembled properly according to those instructions. For simplicity’s sake, let’s just keep it to one gene. One booklet:

Now, remember the letters of the genetic sequence represent the 4 different molecules above. More precisely, each letter actually represents TWO molecules – the two that naturally pair up together (and slowly twist into the double helix as the pairs stack) as we looked at earlier. This isn’t the best analogy, but think of it basically like vinyl window stickers that spell something out:

The stickers could be packaged already cut out, but it’s way more stable to package them together with the perfectly-paired inverse of the part that does make sense (the letters).

From what is known to date, the pairing results from the high stability of the affinities between the paired molecules in the DNA – not necessarily because both sequences (on opposing sides of the helix) contain meaningful information. The side that does have meaningful genetic info (the genes) is like a sentence that must be read in one direction – the “sense” strand of the DNA. The opposing strand of the helix – the part that is not annotated in the A, T, G, C sequences above – is simply called the “anti-sense” strand.
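Because the pairing is so predictable, the anti-sense strand can always be worked out from the sense strand. Here is a minimal Python sketch of that idea. One detail not covered above: the two strands run in opposite directions, so the sketch reverses the result to write the anti-sense strand in its own reading direction. The function name `anti_sense` and the example letters are made up for illustration.

```python
# Pairing rule from earlier in the post: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def anti_sense(sense: str) -> str:
    """Pair up every letter of the sense strand, walking it backwards
    so the result is written in the anti-sense strand's own direction."""
    return "".join(PAIRS[base] for base in reversed(sense.upper()))


print(anti_sense("ATGGCT"))              # AGCCAT
print(anti_sense(anti_sense("ATGGCT")))  # ATGGCT (pairing twice gets you back)
```

This is the sticker sheet in code form: keep one side and you can always rebuild the other.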

B) Build

To begin the building process, the instruction book has to be opened up in order for you to read the instructions and start building. The same is true for the double-helix of a gene. For the sake of this analogy, let’s also assume you’re building this kit with your two illiterate kids, Birthday Kid and Helper Kid. Also, let’s assume the booklet won’t lay flat on the floor because it was creased so much that it won’t stay open, and also that Birthday Kid is having a tantrum and ran away to build a couch-cushion-fort to hide in (taking the bags of Lego pieces along). So, Helper Kid opens up the booklet and has to hold it open; similarly in the cell, a protein “opens” the DNA (pulls apart the double-stranded helix) where the specific gene is located:

Your job as the parent is to read the text and understand the language. Consider this: In the English language of 26 characters, with countless combinations of those characters (and the shortest word being 1 character, the longest having 36), there are >1 million unique words that can be arranged into sentences, that then relay meaning to the reader. The alphabet in genetics contains only 4 characters: A, T, G, and C. These letters are always, and only, arranged in 3-letter combinations to form the “words,” or in this context, called “codons.” So, there are only 4³, or 64, unique “words” in the genetic “language.” This cannot change, but conversely, new English words are being invented all the time. Similar to English, though, the sentences made by unique arrangements of the language’s words can vary a lot in length. In the cell, the “sentences” are the genes, comprised of “words” (codons) and also vary in length. In both cases, in any given sentence, the words can be arranged in ways that either have a clear message, or don’t.
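The 4³ = 64 arithmetic is easy to check by simply listing every possible 3-letter word. The Python sketch below does that, and also chops a short made-up sequence into codons. `split_into_codons` is a hypothetical helper, and dropping leftover letters at the end is a simplification chosen for this example.

```python
from itertools import product

LETTERS = "ATGC"

# Every possible 3-letter "word" built from the 4-letter alphabet.
all_codons = ["".join(combo) for combo in product(LETTERS, repeat=3)]
print(len(all_codons))  # 64


def split_into_codons(gene: str) -> list[str]:
    """Cut a sequence into consecutive 3-letter words.
    Leftover letters at the end (fewer than 3) are dropped."""
    usable = len(gene) - len(gene) % 3
    return [gene[i:i + 3] for i in range(0, usable, 3)]


print(split_into_codons("ATGGCTTAA"))  # ['ATG', 'GCT', 'TAA']
```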

Now, let’s say the booklet was text-only, with no pictures, but you still wanted Birthday Kid to be the one to physically assemble this forklift from inside the couch-cushion-fort. One way to articulate the instructions would be to convert the text into drawings (like you’d find in any Lego kit already); let’s call this a “transcription” process, as you’d be transcribing written word into a different visual representation (illustrations) but maintaining the same meaning as the text-only version:

On the left, these professional-grade Lego instructions are drawn by you, the reader/transcriber. On the right, your equivalent is the purple blob (which is actually a protein): this protein’s job is to read the sentence (the A, T, G, C sequence) and transcribe the original into a simplified version – more or less a single-stranded, linear molecule also comprised of nucleotides, but way less stable than DNA because it doesn’t have the paired anti-sense strand to form a helix. These molecules are much shorter than DNA too.

(A fifth nucleotide variant is actually used in RNA sequences, U instead of T, but that’s beyond the scope here.)

Anyways, once that blob (protein) transcribes the DNA into RNA, it can now be handed over to another protein to do something with it. In the Lego example, the sheets you’re drawing on, not bound in a booklet and transcribed from the text-only instructions, are the less-stable but more-mobile RNA handed over to the Birthday Kid.
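At the level of letters, transcription can be sketched as a very small text operation. This is a deliberate simplification: in the cell the protein builds the RNA by pairing against the anti-sense strand, and the result carries the same message as the sense strand with U written wherever T appeared. The function name `transcribe` and the example sequence are made up for illustration.

```python
def transcribe(sense_strand: str) -> str:
    """Write the message of a DNA sense strand as mRNA letters.
    Simplified: same letters, with U taking the place of T."""
    return sense_strand.upper().replace("T", "U")


print(transcribe("ATGGCTTAA"))  # AUGGCUUAA
```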

I’m going to get one step more specific, because there are different types of RNA; this one we’re talking about here is specifically “messenger RNA” or “mRNA” (the green strand in the previous side-by-side comparison). Why “messenger?” Because it’s like a simplified version of the full instruction set, which is handed over to someone that can do something useful with it. In the cell, it’s because it fits through the openings in the nucleus’ “wall” – basically, moving it from the safe storehouse of the nucleus, into the greater expanse outside of it (but still inside the cell) where more molecules, thingamajigs, doohickies are found that do other things inside the cell. In the living room, the mRNA is the amazing illustrations that you drew and handed to thingamajig – er.. Birthday Kid. Here’s a chart of a few of the other molecules floating around; this group is referred to as “amino acids”:

Now what?? Well now that the message has been delivered to the thingamajigs that can actually do something with the instructions – in this case, more but different proteins, or the Birthday Kid – the ACTUAL building process starts. The message of the mRNA must now be changed into a functional thing – in other words, the message is “translated” into a functional outcome. In the cell, that means translating mRNA into a protein; with Lego, it means translating the drawn instructions into the actual forklift.
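Translation can be sketched the same way: read the mRNA three letters at a time and look up which amino acid each codon calls for. The table below holds only a handful of entries from the standard genetic code (the full table has all 64 codons), and it includes "stop" codons, which tell the builder the protein is finished. The names `CODON_TABLE` and `translate`, and the example mRNA, are made up for this sketch.

```python
# A small excerpt of the standard genetic code (the real table has 64 entries).
# None marks a "stop" codon, which ends the protein.
CODON_TABLE = {
    "AUG": "Met",  # also the usual "start" signal
    "GCU": "Ala",
    "GGA": "Gly",
    "UUU": "Phe",
    "AAA": "Lys",
    "UAA": None,
    "UAG": None,
    "UGA": None,
}


def translate(mrna: str) -> list[str]:
    """Read the mRNA one codon at a time and collect the amino acids,
    stopping at the first stop codon."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this excerpt of the table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon: the chain is complete
            break
        chain.append(amino_acid)
    return chain


print(translate("AUGGCUGGAUUUUAA"))  # ['Met', 'Ala', 'Gly', 'Phe']
```

In Lego terms, each codon is one step in the drawn instructions and each amino acid is the block that step tells Birthday Kid to snap on next.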

In both cases, the thingamajig/kid needs to grab onto small molecules floating around, then assemble them as instructed by the message, into that bigger functional end-product molecule. In the cell, amino acids are assembled by the thingamajig protein to create New Protein; in the living room, the small pieces are the Lego blocks that Birthday Kid will build into the Forklift (You can see another type of RNA, “tRNA,” is involved at this step):

C) Output

Eventually, the amino acids joined to form long chains – “peptides” – also end up folding up on themselves, based on chemical affinity between the amino acids, resulting in proteins. (Note there are a few ways proteins are illustrated; the simplest way is just a big blob like we’ve been looking at so far, but other formats make it a bit clearer that these too are just unique arrangements of molecules joined together in 3D space):

different illustrations of the same protein (askabiologist.asu.edu)

Same idea as the nucleotides in DNA, but with peptides, the larger end-molecules are not nearly as regular or consistent. There’s a massive diversity of protein sizes and structures (and, thus, their functions too):

Similarly, Lego blocks can be assembled in nearly infinite combinations and sizes, also much more varied and seemingly more disorderly than the instruction set from which they originate.

And, voila. Using the proper set of instructions, with the necessary mechanisms and materials throughout the process, the useful end-product is born! To summarize this entire post:

In future posts, I’m going to refer to very specific nuanced aspects of this overall process, which is why I chose to go into very specific details in some parts. Without laying down the foundation, further discussion cannot make sense. This process will be referenced in discussions related to plant breeding, genetic engineering, natural processes and synthetic technologies, and more. Stay tuned!