The application of
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
Theresa Fiott is a Software Developer in the Health Team within the Programme Management Department at MITA. She is fascinated by all things medical and is especially intrigued by any potential advances technology can bring to the medical field.
That sequence of letters defines the structure of human insulin, from which all its biological properties derive. It is the first amino acid sequence of a protein ever established, a chain made up of 110 amino acids, discovered by biochemists in 1951. This discovery was the stepping stone to the synthesis and mass production of human insulin for use by diabetics all over the world. Now, almost 70 years later, analysing protein sequences like this one remains a central topic of bioinformatics in laboratories throughout the world.
Before the era of bioinformatics, only two ways of performing biological experiments were available: within a living organism (in vivo) or in an artificial environment (in vitro). Experiments to gather information about proteins were therefore understandably very limited.
Proteins are found in all living things and are all made up of the same basic building blocks, called amino acids. There are 20 different amino acids, each referred to by a specific name such as alanine, glycine, lysine or proline. Each can be symbolized by a one-letter or three-letter code; alanine, for example, is represented as ‘A’ (1-letter code) or ‘Ala’ (3-letter code). A protein can be made up of anywhere from 100 to 33,000 amino acids of these 20 kinds linked together in a chain, and the precise sequence in which they are arranged determines the protein’s three-dimensional structure.
The sequencing of insulin inaugurated the modern era of molecular and structural biology. Biology got a taste of its first fundamental dataset: molecular sequences. Since computers were not yet widely available in the 1950s, sequences were assembled, analysed and compared manually: by writing them on pieces of paper, taping them side-by-side on laboratory walls, and moving them around for optimal alignment (now called pattern matching). As soon as the early computers became available, the first computational biologists started to program these manual algorithms into them and to use databases as memory banks. This practice was brand new, since nobody until then had had to manipulate and analyse molecular sequences as texts. Most methods had to be invented from scratch and, in the process, a new area of research, the analysis of protein sequences using computers, was born. This was the genesis of bioinformatics.
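To make the idea of treating a protein as text more concrete, here is a minimal Python sketch. It is only an illustration, not the method used by the early computational biologists: the names `THREE_LETTER`, `INSULIN`, `to_three_letter` and `best_offset` are invented for this example, and the sliding comparison is a deliberately naive stand-in for the paper-and-tape alignment described above (it allows no gaps).

```python
# One-letter -> three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# The sequence quoted at the start of the article, with layout spaces removed.
INSULIN = (
    "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVC"
    "GERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQ"
    "KRGIVEQCCTSICSLYQLENYCN"
)


def to_three_letter(sequence):
    """Spell out a one-letter sequence using three-letter codes."""
    return "-".join(THREE_LETTER[residue] for residue in sequence)


def best_offset(long_seq, short_seq):
    """Slide short_seq along long_seq and return (offset, matches) for the
    position with the most identical letters. No gaps are considered, and
    the earliest offset wins a tie."""
    best = (0, -1)
    for offset in range(len(long_seq) - len(short_seq) + 1):
        window = long_seq[offset:offset + len(short_seq)]
        matches = sum(a == b for a, b in zip(window, short_seq))
        if matches > best[1]:
            best = (offset, matches)
    return best


if __name__ == "__main__":
    print(len(INSULIN))                    # number of amino acids in the chain
    print(to_three_letter(INSULIN[:4]))    # first four residues, spelled out
    print(best_offset(INSULIN, "GIVEQ"))   # where a short fragment lines up best
```

Here `len(INSULIN)` gives 110, matching the chain length quoted in the article, and `to_three_letter("MALW")` returns `Met-Ala-Leu-Trp`. Real alignment tools also handle gaps and score similar (not just identical) amino acids, which this sketch leaves out.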
Since its conception, bioinformatics has grown into an interdisciplinary field of science that combines biology, computer science, information engineering, mathematics and statistics to analyse and interpret biological data. This has allowed for essential discoveries, such as differentiating the proteins produced by a certain type of bacteria from the protein forming the coat of a particular virus, and identifying mutant proteins that reveal whether an organism has a genetic disease, amongst many others. If the structures and amino acid sequences of proteins are known, new medicines can be discovered by simulating their interaction with certain small-molecule drugs. If the signature proteins in anthrax spores and botulism toxins could be accurately detected, it would be possible to provide an early warning about the proximity of biological warfare weapons. When mutations occur in DNA, it is the proteins that are ultimately affected; their analysis could therefore give insight into why certain mutations occur and what their global effect could be.
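As a rough illustration of how a mutant protein might be spotted once sequences are handled as text, the following Python sketch compares a reference fragment with a sample of the same length and lists the positions where they differ. The function name `point_differences` is invented for this example, the reference is a fragment of the sequence quoted at the start of the article, and the "sample" is a made-up variant used purely for demonstration, not a real clinical mutation. Insertions and deletions are outside the scope of the sketch.

```python
def point_differences(reference, sample):
    """Return (position, reference_residue, sample_residue) for every
    mismatch between two equal-length sequences. Positions are 1-based."""
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


# A fragment of the sequence quoted at the start of the article.
REFERENCE = "FVNQHLCGSHLVEALY"
# Invented variant, for illustration only: residue 6 changed from L to P.
SAMPLE = "FVNQHPCGSHLVEALY"

if __name__ == "__main__":
    for pos, ref, alt in point_differences(REFERENCE, SAMPLE):
        print(f"position {pos}: {ref} -> {alt}")
```

Run as a script, this reports a single difference at position 6 (L replaced by P). Deciding whether such a difference actually causes disease requires biological evidence well beyond a text comparison.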
Bioinformatics, the computational branch of molecular biology, is an umbrella term encompassing many fascinating fields: not only the study of proteins (proteomics), but also the study of genes (genomics), drug design and delivery, and many others, all born of biologists’ desire to answer biological questions and all highly significant in their own way. Bioinformatics is at the centre of the most recent developments in biology, such as the deciphering of the human genome, systems biology (trying to look at the global picture), new biotechnologies, new legal and forensic techniques, as well as the personalized medicine of the future.
Nowadays, technology is being used in all aspects of science and medicine, from helping doctors diagnose patients to recommending to governments the research they should fund based on their nation’s needs. The contribution of bioinformatics to state strategies on life sciences innovation has become increasingly popular with governments. Announcing a £32 million investment in bioinformatics in February 2014, the UK Minister for Science David Willetts