Hello and welcome to the next module
in the Waters Peptide and Protein Bioanalysis Boot Camp.
My name is Khalid Khan, and I'm part of Health Sciences'
marketing team here at Waters.
Today, I will be presenting on peptide and protein structure.
So let's get started.
Here are a number of workflows for large molecule
biotherapeutic and protein biomolecule analysis.
Today, LC-MS is increasingly used for protein quantification
as an alternative to traditional ligand binding assays.
Proteins can be analyzed by LC-MS,
either using intact protein or surrogate peptide workflows.
Both tandem quadrupole and high resolution mass spectrometers
can be used.
Normal flow and microflow LC systems
are also commonly used with both of these mass spectrometer
systems.
Most of this module will focus on the surrogate peptide
workflow using tandem mass spectrometers,
and understanding peptide and protein structure
is important when developing both intact and surrogate
peptide workflows.
The areas covered in this presentation
will be the basic structure of amino acids,
peptides, and proteins, including
a few specific examples, such as monoclonal antibodies.
The basic structure of peptides and proteins
has an impact on both the sample treatment and LC-MS method
development.
The ionization and fragmentation of peptides
and how these aspects differ from small molecules
will also be covered.
The presentation is mainly intended
for scientists who already have some experience
in small molecule LC-MS method development.
The aim is to provide an introduction to peptide
and protein structure and explain
the commonly used terms in peptide and protein LC-MS method
development.
The presentation will also prepare you
for subsequent modules in the Waters Peptide and Protein
Bioanalysis Boot Camp.
In this first section, let's look at the structure
of peptides and proteins.
Peptides and proteins are chains of amino acids joined together.
There are no agreed criteria that specify
the length of an amino acid chain at which it
is called a peptide or a protein.
One common definition is that if the amino acid chain consists
of fewer than 50 amino acids, it is
called a peptide, and if it has more than 50 amino acids,
it is called a protein.
This definition is not absolute, and you
can have large peptides and small proteins
of similar amino acid chain lengths.
All human proteins are formed from just 20
naturally occurring amino acids, or 21,
if you include selenocysteine.
In terms of molecular weight, peptides
are typically less than 6000 daltons,
whereas proteins can be anywhere from 5800
daltons for a small protein such as insulin
to several hundred thousand daltons for large proteins
such as thyroglobulin.
This slide illustrates the mechanism
of how two amino acids join together
to form a peptide bond.
The carboxyl group of one amino acid
reacts with the amine group of another amino acid
to form a peptide bond.
The resultant peptide will have a carboxyl group on one end,
and this is referred to as the C-terminal end.
The end with the free amine group is referred to as the N-terminal end.
As we will see later, these peptide bonds
fragment in a highly predictable manner in a mass spectrometer
collision cell.
Amino acids and peptides can exist as zwitterions.
This means that they can have both negative and positive
charges, depending on the pH.
This is an important factor when developing sample clean up
methods at the peptide level.
This will be discussed in more detail in later modules.
The chain of amino acids that form the backbone of a peptide
or protein is referred to as its primary structure.
Amino acids are usually represented by a single letter
or three letter abbreviation.
Here is a table of the 21 amino acids
from which human peptides and proteins are formed.
Some single letters are obvious, for example, G
for glycine and A for alanine.
Others are less obvious, such as K for lysine
and R for arginine.
As we will see later in this presentation,
lysine and arginine are very important when
we discuss the breakdown of large proteins
into smaller peptides using specific enzyme digestion.
This slide illustrates the wide variety of structures
and resultant chemical properties of amino acids.
The chemical structure of the amino acids
influences the polarity, hydrophobicity,
and acidic/basic nature of the resultant peptides
and proteins.
Note that cysteine contains a sulfur atom, which
means that two cysteine amino acids can form
disulfide bonds between them.
These disulfide bonds can form within the same peptide chain
or connect two different peptide chains.
I stated earlier that the diverse properties of peptides
and proteins have a large impact on the sample pretreatment
and LC-MS method development.
Note that some amino acids have a second amine group, which
means that they have multiple sites that
can be protonated to form multiply charged
positive ions.
As the structures of all amino acids are well known,
it is possible to calculate the mass of a peptide
from its amino acid constituents.
Don't worry, you will not have to calculate these manually.
Software tools are available to do this automatically for you.
Software tools, such as Skyline, will automatically
calculate the molecular weight of a peptide
from its amino acid sequence.
For example, the peptide D-E-V-I-L,
which consists of aspartic acid, glutamic acid, valine,
isoleucine, and leucine, will have a mass of 587.31662
daltons.
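As a rough illustration of what such software does, here is a minimal sketch that adds up monoisotopic residue masses and one water for the free termini. The table holds standard textbook values for the 20 common amino acids rounded to five decimals (selenocysteine is left out), and the function name is made up for this example. It is not how Skyline is implemented.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water),
# rounded to five decimals. Selenocysteine is deliberately omitted.
MONO_RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # H2O: H on the N-terminus plus OH on the C-terminus


def peptide_monoisotopic_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(MONO_RESIDUE[aa] for aa in sequence.upper()) + WATER


if __name__ == "__main__":
    # The DEVIL example from the talk; should land close to 587.3166 Da.
    print(f"DEVIL: {peptide_monoisotopic_mass('DEVIL'):.4f} Da")
```

The sketch ignores modifications and disulfide bonds, which real tools handle.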
Note that the table above lists the monoisotopic mass
and average mass.
The monoisotopic mass is the mass where only the most
abundant isotopes are used in the calculation,
e.g., carbon-12, hydrogen-1, nitrogen-14, and oxygen-16.
The average mass also takes the minor isotopes
into account, e.g., carbon-13,
deuterium, and nitrogen-15, weighted by their natural abundance.
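To make the difference concrete, here is a small sketch that computes both masses for the DEVIL peptide from its elemental composition. The formula C26H45N5O10 is worked out here from the five residues plus one water and is not taken from the slide. The atomic masses are standard rounded values, and the function name is only illustrative.

```python
# Mass of the most abundant isotope of each element (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
# Abundance-weighted average atomic masses (Da), rounded standard values.
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed elemental composition of DEVIL (Asp-Glu-Val-Ile-Leu + H2O).
DEVIL_FORMULA = {"C": 26, "H": 45, "N": 5, "O": 10}


def formula_mass(formula: dict, atom_masses: dict) -> float:
    """Sum atom counts times the chosen per-element mass."""
    return sum(count * atom_masses[element] for element, count in formula.items())


if __name__ == "__main__":
    mono = formula_mass(DEVIL_FORMULA, MONOISOTOPIC)
    avg = formula_mass(DEVIL_FORMULA, AVERAGE)
    print(f"monoisotopic: {mono:.4f} Da")
    print(f"average:      {avg:.3f} Da")
```

The average mass comes out slightly higher than the monoisotopic mass because the heavier minor isotopes pull the weighted average up.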
Proteins can exist in different forms and structures.
So far, we have only discussed the basic amino acid
sequence, which is referred to as the primary structure.
Amino acids can form hydrogen bond interactions
between each other, which influences the shape
of a peptide chain or protein.
The most common structures are the beta pleated sheet and the alpha helix.
Bonds and interactions between alpha helices
and pleated sheets result in tertiary structures.
Disulfide bonds between cysteine amino acids
in the peptide chains are common in tertiary structures.
Finally, when more than one different type of peptide chain
is involved, quaternary structure is produced.
This slide illustrates the primary structure
of insulin, which includes two amino acid chains joined
together, the insulin A chain and the insulin B chain.
The slide also shows
a diagram of the tertiary structure of insulin.
Here is an example of a peptide drug, desmopressin.
This is a relatively small peptide
comprised of nine amino acids.
LC-MS development of a peptide of this length
can be treated in the same way as a small molecule LC-MS
method.
The peptide can be analyzed directly
by LC-MS, and standards are available for MRM method
development.
One difference from a small molecule ESI mass spectrum
is the presence of a doubly charged positive ion
in addition to the singly charged ion.
This is a key feature of peptide ionization
that will be discussed later in this presentation
and other modules.
Note the doubly charged ion at 535.22
and the singly charged ion at 1069.435.
An example of a small protein is insulin, which
consists of 51 amino acids.
The A chain has 21 amino acids, and the B chain
is 30 amino acids.
The monoisotopic mass of insulin is
5803.6377 daltons, which is outside the range
of tandem quadrupole mass spectrometer systems, which
typically have an upper mass range limit below 2000 daltons.
However, as insulin forms multiply charged ions
with three, four, and five charges,
it can be analyzed using tandem mass spectrometers.
In this example, the five plus ion is shown at a mass-to-charge ratio of 1162.
Insulin also forms three plus and four plus ions.
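The relationship between neutral mass, charge, and mass-to-charge ratio can be sketched in a few lines. This assumes positive-mode ions formed by adding protons only. The neutral mass used for insulin, about 5803.64 daltons, is an assumed monoisotopic value for human insulin, chosen because it is consistent with the five plus ion near 1162. The function name is made up for this example.

```python
PROTON = 1.007276  # mass of a proton in daltons


def mz_of_charge_state(neutral_mass: float, charge: int) -> float:
    """m/z of an [M + zH]z+ ion: add z protons, then divide by z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    insulin_mass = 5803.6377  # assumed monoisotopic mass of human insulin (Da)
    for z in (3, 4, 5):
        print(f"{z}+ ion: m/z {mz_of_charge_state(insulin_mass, z):.2f}")
```

All three charge states fall below 2000, which is why a protein of this size is still within reach of a tandem quadrupole.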
Note again the disulfide bonds connecting the two amino acid
chains between two cysteine amino acids.
These are very common protein structures.
Here are some examples of larger proteins,
ranging from insulin-like growth factor, IGF-1,
with a molecular weight of 7649 daltons,
to thyroglobulin, which has a molecular weight over 660,000
daltons.
The slide also shows medium sized proteins,
such as CRP and apolipoprotein A1, which
have molecular weights in the mid-20,000 dalton range.
We can see that as the size of the protein
increases, measuring the intact protein
becomes more difficult and is virtually
impossible using limited range tandem mass spectrometers.
However, we can break down large proteins into smaller peptide
units and analyze these peptides using tandem mass
spectrometers.
This approach is called a surrogate peptide approach
and is widely used in protein bioanalysis and protein
biomarker research.
Antibodies are a specific class of proteins
with a common structure.
They are large Y-shaped proteins with two heavy chains and two
light chains.
The heavy chains are linked to each other by disulfide bonds.
Disulfide bonds also link the light chains with the heavy chains.
Human immunoglobulins are antibodies
produced by plasma cells, a type of white blood cell, to fight infections.
The heavy chains each contain approximately 440 amino acids,
and the light chains contain 220 amino acids.
Monoclonal antibody drugs now form a very important class
of therapeutics and need to be measured in biomedical studies
and clinical research studies.
One of the most widely used monoclonal antibody drugs
today is infliximab, which is used
to treat autoimmune conditions such as Crohn's disease.
Infliximab binds to TNF alpha and has
a molecular weight of approximately 150,000 daltons.
Infliximab is known as a chimeric antibody.
So how do we analyze large proteins
of several thousand daltons using tandem quadrupole mass
spectrometers, which usually have a limited mass range of less
than 2000 daltons?
The approach used is to break down the proteins
into smaller peptides using digestion with enzymes.
A number of different enzymes are used.
The most commonly used enzyme is trypsin,
which cleaves proteins in very specific locations.
Trypsin cleaves proteins adjacent to lysine
and arginine.
Cleavage is always on the C-terminal side
of the amino acid.
This means that peptides arising from trypsin digestion, which
are called tryptic peptides, can
be predicted from the amino acid sequence of the protein.
Online software tools are available to predict
tryptic peptides.
These online tools also predict the fragmentation
of those peptides in a mass spectrometer.
This is the basis of the surrogate peptide approach,
where a peptide or peptides are quantified
as a surrogate for the proteins from which
the peptides were derived.
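The cleavage rule just described, cutting after every lysine (K) or arginine (R), can be sketched as a simple in-silico digest. The function name and the test sequence are invented for illustration. The optional flag reflects a commonly applied refinement that is not part of this talk, namely that trypsin is usually assumed not to cut when the next residue is proline. Missed cleavages are not modeled.

```python
def tryptic_digest(protein: str, skip_before_proline: bool = False) -> list:
    """Split a protein sequence on the C-terminal side of every K or R.

    skip_before_proline is an optional assumption (not from the talk):
    when True, a K or R directly followed by P is not treated as a cut site.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            next_is_proline = i + 1 < len(protein) and protein[i + 1] == "P"
            if skip_before_proline and next_is_proline:
                continue
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Whatever follows the last cut site is the C-terminal peptide.
        peptides.append(protein[start:])
    return peptides


if __name__ == "__main__":
    hypothetical_protein = "AAKGGRPLLKVV"  # made-up sequence, not a real protein
    print(tryptic_digest(hypothetical_protein))
    print(tryptic_digest(hypothetical_protein, skip_before_proline=True))
```

Every peptide except possibly the last one ends in K or R, which is what makes the products predictable from the sequence alone.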
In some cases, proteins cannot be digested directly by enzymes
such as trypsin and require pretreatment prior
to digestion.
One example of this is treatment of disulfide bonds,
which are reduced and alkylated prior to digestion.
If the amino acid sequence of the tryptic peptide
is unique to the protein from which it was derived,
it is called the signature peptide.
The use of signature peptides means that the method
is more selective and specific.
Tryptic peptides should contain between 8 and 20 amino acids.
In addition, a tryptic peptide should not
contain amino acids that can be easily chemically modified,
such as cysteine and methionine.
The selection of tryptic peptides
will be discussed in more detail in other modules
in this series.
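The selection rules mentioned here (a length of 8 to 20 amino acids, no cysteine or methionine, and uniqueness to the target protein) can be sketched as a simple filter. The function names and sequences are hypothetical, and the uniqueness check is a plain substring search against whatever background sequences you supply, which is a simplification of what real tools do.

```python
def is_candidate(peptide: str, min_len: int = 8, max_len: int = 20,
                 avoid: str = "CM") -> bool:
    """Length within range and none of the easily modified residues present."""
    return (min_len <= len(peptide) <= max_len
            and not any(aa in avoid for aa in peptide))


def is_signature(peptide: str, background_proteins: list) -> bool:
    """True if the peptide sequence does not occur in any background protein."""
    return not any(peptide in protein for protein in background_proteins)


if __name__ == "__main__":
    # Made-up tryptic peptides and background sequence, for illustration only.
    peptides = ["AAK", "LVNEVTEFAK", "CCTESLVNR", "GGSLMDEVILK"]
    background = ["TTTLVNEVTEFAKGGG"]
    for p in peptides:
        print(p, is_candidate(p), is_signature(p, background))
```

A peptide would normally need to pass both checks before MRM transitions are developed for it.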
Now that we've covered the basic structure of peptides
and proteins, and we've discussed
how peptides can be produced from proteins using
enzyme digestion, let's look at how peptides fragment
in a mass spectrometer.
This slide highlights some of the differences
between LC-MS of small molecules and LC-MS
of proteins and peptides.
One difference, which has already
been discussed in earlier slides,
is that peptides form multiply charged ions.
Doubly, triply, and even more highly charged peptide ions
are very common.
This is very different to small molecule LC-MS
where usually the precursor ion is singly charged.
Peptide fragments generated in a mass spectrometer collision
cell can have fewer charges than the precursor ions.
This means that peptide fragments that
have fewer charges can appear at a higher mass
to charge ratio than the precursor ions.
This is very different to what you
would see in the small molecule fragmentation
where the product ion is always at a lower mass to charge
ratio than the precursor ion.
Also, as we've seen before, peptides fragment
in a highly predictable manner along the amino acid chain.
Peptides can fragment at a number of predictable locations
in the peptide chain.
The nomenclature of the resulting fragment ions
depends on which bond has been broken.
When fragmentation occurs at the peptide bond,
the C-terminal fragment is called
the y ion and the N-terminal fragment is called the b ion.
Y and b ions are the most important
for quantification using mass spectrometry.
For tryptic peptides, the y ion will always
have a lysine or arginine amino acid at the C-terminal end.
Fragmentation can also occur adjacent to the peptide bond,
leading to other ions which are called z, c, a, and x ions.
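The b and y ion series can be sketched as follows, assuming singly charged fragments and the usual conventions: a b ion is the sum of its residues plus a proton, and a y ion is the sum of its residues plus water plus a proton. The peptide DEVILK is a made-up tryptic-style example, and the residue table is only the subset it needs.

```python
# Subset of monoisotopic residue masses (Da); extend as needed.
MONO_RESIDUE = {
    "D": 115.02694, "E": 129.04259, "V": 99.06841,
    "I": 113.08406, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.007276


def by_ions(peptide: str) -> dict:
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    masses = [MONO_RESIDUE[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(masses)):
        # b_i: first i residues (N-terminal side) plus one proton.
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        # y_i: last i residues (C-terminal side) plus water plus one proton.
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


if __name__ == "__main__":
    for name, mz in sorted(by_ions("DEVILK").items()):
        print(f"{name}: {mz:.4f}")
```

For this example every y ion contains the C-terminal lysine, which matches the point made above about tryptic peptides.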
As we've already discussed, peptides
can produce a number of predictable fragment ions.
The transitions that we choose to use in an MRM experiment
need to be carefully considered.
In this example, fragmentation of the ion
at 523.2808 results in a number of fragment ions
shown in the lower half of the slide.
Which ones would be the best to use in an MRM method?
There are a number of potential fragment ions we could use.
There's the most intense ion at 239,
and other ions at 341, 523, 873, and 1045.
Let's evaluate these ions now.
The ions shaded in red, although intense,
may not be a good choice as these are all low masses
and could be prone to interference
from other peptides.
The ion at 1045 is the singly charged form of the precursor ion at 523.28,
so it would not be used.
The y ions shown in the green shaded area at 873, 944, 802,
and 674 are all potentially usable
as they are of sufficient intensity and size.
This slide again highlights another feature
of peptide fragmentation in a mass spectrometer, which
is that doubly charged ions fragment
to singly charged ions, therefore resulting
in a product ion at a higher mass
to charge ratio than the precursor ion.
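The evaluation of the fragment ions in this example can be sketched as a small filter. It uses the nominal fragment values quoted from the slide and assumes the precursor at 523.2808 is doubly charged. The two rules used here are simplifications assumed for illustration: keep fragments above the precursor mass-to-charge ratio, and drop anything that matches the singly charged intact peptide. Intensity, which also matters, is not modeled, and the one-unit tolerance is an assumption because the slide values are nominal.

```python
PROTON = 1.007276  # mass of a proton in daltons


def singly_charged_mz(precursor_mz: float, charge: int) -> float:
    """m/z the intact peptide would show as a 1+ ion."""
    neutral_mass = (precursor_mz - PROTON) * charge
    return neutral_mass + PROTON


def shortlist_fragments(precursor_mz: float, charge: int,
                        fragment_mzs: list, tol: float = 1.0) -> list:
    """Keep fragments above the precursor m/z that are not the intact 1+ ion."""
    intact = singly_charged_mz(precursor_mz, charge)
    return [f for f in fragment_mzs
            if f > precursor_mz + tol and abs(f - intact) > tol]


if __name__ == "__main__":
    fragments = [239, 341, 523, 674, 802, 873, 944, 1045]  # nominal slide values
    print(f"intact 1+ ion: {singly_charged_mz(523.2808, 2):.4f}")
    print(shortlist_fragments(523.2808, 2, fragments))
```

With these inputs the filter keeps the same four y ions that were highlighted in green.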
So we may not have access to standards
of all the potential tryptic peptides we want
to develop MRM methods for.
However, there are software tools
such as Skyline which can predict fragmentation
of tryptic peptides.
Tools such as Skyline predict
and suggest fragment ions that can be used in LC-MS method
development.
These ions can be evaluated later by experiment.
This is very important, as it means that you do not
need to have access to standards of the tryptic peptide
for initial method development.
So let's summarize what we've learned
about peptide ionization and fragmentation.
Peptides form multiply charged ions,
which is very different to traditional small molecule
analysis.
Peptides fragment in a highly predictable manner
in the mass spectrometer, and these fragments
can be predicted using software tools.
The software tools also recommend
which MRM transition to use.
The resultant fragment ions, which are often y ions,
can have a higher mass to charge ratio than the precursor ion.
The MRM transitions that are finally used
are selected based on specificity and intensity
of the fragment ions.
So let's summarize some of the key points of this introduction
to peptides and protein structure.
Peptides and proteins are made of amino acids
and can form a variety of complex structures.
Small proteins and peptides can be analyzed directly, i.e.,
intact by tandem quadrupole LC-MS systems.
Larger proteins usually require digestion to smaller peptides
for quantification by tandem quadrupole LC-MS systems.
Enzymatic cleavage sites are predictable,
and software tools are available that
can predict tryptic peptides.
The structure of peptides and proteins
impacts all stages of the bioanalysis workflow.
This slide shows the workflow for the surrogate peptide
approach, where a protein is enzymatically digested
by trypsin to produce signature or unique peptides.
The process starts with selecting
unique peptides which represent the protein we
are trying to measure.
These unique peptides are predicted by software tools.
The best MRM transitions are then selected and optimized.
We then go through the process of optimizing
sample preparation, which may involve clean-up at the protein
level, reduction and alkylation, digestion, and peptide level
clean-up.
The MRM transitions may then need to be fine
tuned using peptides generated in a biological matrix.
The structure of peptides and proteins is an important factor
and needs to be considered in all of the above steps.
This presentation was designed to introduce peptide
and protein structure and how the structure of peptides
and proteins influences LC-MS method development.
Further information is available on a variety of web based
resources, including these.
Thank you for listening.