Blender, Biology, Cursor, AI

Exploring Proteins

Investigating the world of microbiology through 3D visualisation

Created: 7/1/2025 | Updated: 8/25/2025

Introduction

I've always been fascinated by the idea of extending human lifespan. While the concept of uploading consciousness to a machine doesn't appeal to me, I'm drawn to the biological approach of reversing the ageing process. After reading "Lifespan" by David Sinclair and "Ageless" by Andrew Steele, I was encouraged by the growing number of researchers working in this field, though any breakthrough still looks scarily far away. Imagine being the generation who starts to die off just as they invent the cure for ageing!

Wanting to contribute somehow, but not yet having a background in biology, I thought I'd tackle this through something I do know. At the 2022 Blender Conference, I attended a presentation by Brady Johnston about Molecular Nodes, a Blender extension that allows the visualisation of proteins within Blender, using the geometry nodes feature.

Starting this project, I don't really have any prior knowledge of biology, save for what I did at school for my Biology GCSE, so I'll likely be exploring tangents I find interesting along the way.

Objectives

To give me some direction as I work in this area I've decided to set some goals for myself. They're pretty flexible and I don't want to fence myself in too heavily, though I feel these three will point me in the right direction:

  • Get more familiar with the science behind molecular biology
  • Build Molecular Nodes locally and contribute something to the project
  • Build a website for creating interactive visualisations of any given protein

These are my initial goals, though at the moment there are a lot of unknown unknowns. It could turn out that I don't end up using Molecular Nodes and Blender for the final site I build. We'll have to wait and see!

Implementation

Creating a viewer

The first thing I wanted to do was launch Molecular Nodes, export a .glb file and get it displayed on this page. The first step would be to create a 3D model viewer that I could reuse all over this site - you can see this below, displaying the classic Blender monkey Suzanne.

Wow, look at that! A viewer created using three.js displaying the Blender monkey Suzanne. That was implemented mostly by asking the Cursor agent to create a model viewer component for me. The first shot was pretty good; it just needed some slight lighting tweaks and the ability to add a caption. Easy! Now we have a reusable component for visualising 3D models. It doesn't do anything too fancy just yet, though it does provide basic controls. Further down the line I might add drei to help with that, but for now let's get onto downloading some proteins!

Identifying a protein

It's a funny idea, downloading a protein. Thankfully there's a really handy repository called the Protein Data Bank (PDB). This is a global repo that contains the experimentally determined 3D structures of proteins and other biological molecules. It's been going for over 50 years and hopefully should be an easy way to get the information required to visualise a protein molecule.

To begin, I thought I'd look into the current molecule of the month - at the time of writing this is Beta-Lactamase. The name means very little to me at the moment, though it's got to be pretty important: it's the enzyme bacteria use to break down beta-lactam antibiotics such as penicillin, and 75% of the antibiotics we use today are based on that beta-lactam structure. Here's the image from the summary page. It's a lovely toon-shader view of a molecule, though I'm no closer to knowing what a Serine 70 or a lactam ring is.

Beta-Lactamase protein structure

Looking into how the PDB works, it turns out that every structure that is submitted is given a unique four-character identifier: a digit followed by three letters or digits, case insensitive. As far as I can tell, as a new structure is submitted to the PDB it's given a sequential identifier, incrementing the trailing characters and then eventually the leading number. For example, if 1AAA was the identifier of the first protein then the next submitted protein would be given 1AAB. Human haemoglobin, which was figured out a while back, is 1HHO, while a more recent coronavirus structure is 7PZK.
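To make the identifier format concrete, here's a minimal sketch of a check I could run before asking the PDB for anything. It assumes the classic four-character form (a digit followed by three letters or digits); the function name `normalise_pdb_id` is my own invention and not part of any PDB tooling.

```python
import re

# Assumption: a classic PDB ID is one digit followed by three letters or digits.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def normalise_pdb_id(raw: str) -> str:
    """Return the upper-cased ID, or raise ValueError if it isn't ID-shaped."""
    candidate = raw.strip()
    if not PDB_ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"not a four-character PDB ID: {raw!r}")
    # IDs are case insensitive, so settle on upper case for consistency.
    return candidate.upper()
```

With this, `normalise_pdb_id("1hho")` and `normalise_pdb_id("1HHO")` give the same result, while anything that starts with a letter is rejected before it ever reaches a download step.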

Anyway, back to our new favourite protein - Beta-Lactamase. It has a unique identifier of 1XPB and a lot of complicated statistics on its PDB page about how the structure was identified and how many atoms are in it (2,167) and how much it weighs (the same as 29,040 hydrogen atoms). I don't really care about this for the minute though - I want to know how those 2,167 atoms are represented in data.

It's worth noting that the PDB page did have a 3D protein viewer, as does every entry in the PDB. I chose to ignore this entirely as I wanted to approach this project from first principles, without getting a sneak peek at the thing I was trying to build.

Downloading a protein

The PDB has a handy button to download the associated files - my first thought was "wow that's easy!". Unfortunately not. Clicking this button opens a dropdown with all kinds of file formats that I've never heard of before: PDBx, mmCIF, PDBML/XML, 2fo-fc coefficients and biological assembly files, to name a few.
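The download button isn't the only way in. As a rough sketch, the same files can be fetched from a script. This assumes RCSB's `files.rcsb.org/download/` URL pattern and only covers the `.cif` and `.pdb` variants; the function names are hypothetical helpers of my own.

```python
from urllib.request import urlopen

# Assumption: RCSB serves structure files at this URL pattern.
DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.{extension}"


def structure_url(pdb_id: str, extension: str = "cif") -> str:
    """Build the download URL for an entry in either mmCIF or PDB format."""
    if extension not in ("cif", "pdb"):
        raise ValueError(f"unsupported extension: {extension!r}")
    return DOWNLOAD_URL.format(pdb_id=pdb_id.upper(), extension=extension)


def download_structure(pdb_id: str, extension: str = "cif") -> str:
    """Fetch the file over the network and return its contents as text."""
    with urlopen(structure_url(pdb_id, extension)) as response:
        return response.read().decode("utf-8")
```

Keeping the URL building separate from the network call means the first half can be checked without an internet connection.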

I thought I had a pretty good handle on the 3D model formats out there, having messed around with a bunch of CAD formats and mesh representation formats for work, such as the gorgeous glTF format. I'd never even heard of any of these weird biological formats before though! I thought I'd just download the first one in the list, the 1XPB.fasta file, and open it up. This is what I saw:

HPETLVKVKDAEDQLGARVGYIELDLNSGKILESFRPEERFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVRELCSAAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTMPAAMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGSRGIIAALGPDGKPSRIVVIYTTGSQATMDERNRQIAEIGASLIKHW

At 318 bytes, that's an absolutely tiny file to represent a protein. I thought I knew how to optimise 3D assets for the web through tools such as Draco, though that rarely gets anything below a few kilobytes.

Some brief research led me to something I'd heard of before - protein folding. While the .fasta file contains the primary sequence of amino acids (the building blocks of a protein), it doesn't tell you how they are folded in 3D space and therefore has no info about atom coordinates or bond angles. Thankfully, a few years back some clever researchers at Google DeepMind figured out how to predict the structure of a protein solely from its amino acid sequence. The tool they developed is called AlphaFold and is really interesting to poke about with. There's an AlphaFold page on Beta-Lactamase. At the top of this page are 3 download links: PDB file, mmCIF file and Predicted aligned error.
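To show just how little is in a .fasta file, here's a minimal sketch of a parser for it. It assumes the usual FASTA layout, where a header line starting with `>` is followed by one or more lines of sequence; `parse_fasta` is my own helper rather than anything from an existing library.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each header line (without the '>') to its joined-up sequence."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; its sequence lines follow.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequences may be wrapped over several lines, so append.
            # Sequence lines that appear before any header are ignored.
            records[header] += line
    return records
```

Each letter in the resulting string stands for one amino acid (H for histidine, P for proline and so on), so `len()` of a sequence gives the number of residues and nothing at all about where they sit in space.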

The fact that AlphaFold offered these 3 files for download had me thinking that the PDB and mmCIF formats would have the information needed to visualise a protein in 3D. I downloaded the files and immediately noticed they had hundreds of kilobytes of information - this had me thinking I was on the right track. A quick bit of research and I found out that the PDB format was developed in the 1970s (wow!) while mmCIF is the more modern format, developed in the 1990s, that handles complex data a bit more easily. I also found out that mmCIF stands for "macromolecular Crystallographic Information File" - I don't know what that means yet so I'll just keep referring to it as either mmCIF or the .cif format for now.
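For a taste of how atoms end up as data, here's a minimal sketch that pulls coordinates out of the older PDB format. It relies on that format's fixed-width columns for `ATOM` and `HETATM` records (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, and x, y, z in 31-54). The `Atom` type and `parse_atoms` function are hypothetical names of my own, and the sketch ignores everything else in the file.

```python
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    residue: str
    chain: str
    x: float
    y: float
    z: float


def parse_atoms(lines: Iterable[str]) -> List[Atom]:
    """Collect atom records from the lines of a PDB-format file."""
    atoms: List[Atom] = []
    for line in lines:
        # The record type lives in columns 1-6; skip headers, remarks etc.
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        atoms.append(
            Atom(
                serial=int(line[6:11]),
                name=line[12:16].strip(),
                residue=line[17:20].strip(),
                chain=line[21],
                x=float(line[30:38]),  # coordinates are in angstroms
                y=float(line[38:46]),
                z=float(line[46:54]),
            )
        )
    return atoms
```

Because the format is column-based rather than whitespace-separated, slicing by position is safer than calling `split()`, which breaks when neighbouring fields run into each other.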

Exploring Molecular Nodes

Now that I've learnt a little about how proteins have their information stored, I'm going to step sideways a bit and create my first proper protein visualisation. As mentioned, I know a tool called Molecular Nodes exists already. A long while back I opened it up and imported a protein, following the guide exactly without much understanding of what it was actually doing. So, first things first, let's load in the 1XPB protein and see if we can get it looking lovely.

I opened up Blender 4.4, enabled the Molecular Nodes addon and had a look at the sidebar it provides. In the scene tab there's a handy little field to add a PDB ID, a way to select the file format (I chose .cif) and a button called fetch. Once I'd entered 1XPB and hit fetch, this beautiful molecule was loaded into my scene:

Blender molecular nodes sidebar

What a gorgeous assortment of spheres and connections. If I were a bacterium I'd for sure be scared of it!

Getting down to business, let's look at the actual stats of this molecule. Loading it in added 2,033 objects to the scene, using 65,862 vertices. Quite a lot going on there, significantly more than the small string of characters we saw above in the .fasta file.

While cool, this isn't an optimal way to visualise the geometry in the browser. Right now there's a bunch of geometry being created and while we can instance it, I really want to play around with a particle system.

Things I Learned

  • The PDB is a handy open database of protein structures, including the amino acid sequence that makes up each protein and how it's physically structured.

Next Time

This section will discuss future improvements and next steps for the project.