Molecular Cartography

Wiki-like knowledge base for more than 1,000 synaptic genes debuts as online portal


The choreography of communication between nerve cells in the brain is exquisitely controlled by synapses, the specialized junctions that pass electrical or chemical signals from one neuron to another. Although synapses have been studied for many decades, much remains unknown.

Now, an international consortium of researchers has released a database of structured frameworks that describe current knowledge about the genetic characteristics of the synapse. The structured frameworks, known as ontologies, are human- and machine-readable computational models describing 1,112 unique genes that regulate synaptic structure and function.


The group is collectively known as SYNGO and includes 15 laboratories, which worked in collaboration with the Gene Ontology Consortium. The database the group developed is version one of the knowledge base. The consortium was established in 2015 by Steven Hyman and Guoping Feng of the Stanley Center for Psychiatric Research at the Broad Institute of MIT and Harvard; it is coordinated by Guus Smit and Matthijs Verhage of the Center for Neurogenomics and Cognitive Research at VU University in Amsterdam.

In June, the group published in Neuron a report of noteworthy features and disease associations of the genes. One of the paper’s authors is Pascal Kaeser, an associate professor of neurobiology in the Blavatnik Institute at Harvard Medical School. HMNews talked with Kaeser about the database and the hopes for its use.

HMNews: This is an amazingly massive project. What was its genesis?

Pascal Kaeser: Both the founders of the consortium and researchers in the field at large knew there was a need for a good “parts list” for synapses.

If you want to know how a car works, you first want to know about the parts of the car. Even though the synapse has been studied for many years, all of the parts lists we had for them had serious limitations in terms of validity or quality control, for example. I think that the idea behind this effort was basically to bring a large number of experts together and agree on how we would determine and describe the parts, how we would present that information, and how we would make the information available to everyone in a format that would allow it to grow.

I was brought on board two or three years ago to work on a specific component of the project. The contribution of my lab was to annotate and post a little more than 50 genes.

HMN: That sounds like a lot.

PK: It was a lot of work. For the entire project, I think there’s something like three thousand annotations for almost 1,200 genes, so there was a large number of labs that did a good amount of work. The work was hard but it also was inspiring.

HMN: Does this project fall into the category of an omics study? Genomics, proteomics, metabolomics—so many fields are gathering large amounts of data to characterize their particular set of molecules. Is this an omics effort for synaptic genes?

PK: I don’t think it falls into that category. Instead, it serves omics work. The idea behind this is that if somebody comes up with an omics study of some sort, be it a proteomic study or something that relates to the disease, they can merge their dataset with the dataset of a synapse, one that’s annotated by experts, and ask “how much does the synaptic proteome, for example, contribute to our process?” The idea is to make omics studies more interpretable by having a good database for quality checking, validation, and decision making.

HMN: Basically, a reference manual.

PK: That’s the idea.

HMN: I know you’ve described this as a parts list for synapses, but when reading the paper, you could interpret the findings almost as personality descriptions of the synaptic genes: how exceptionally sensitive they are to mutational change, how they have held on to their characteristics over millennia, and how they are associated not only with intelligence but also with such neurobehavioral disorders as ADHD, bipolar disorder and schizophrenia.

PK: I think what you felt reflects the fact that these genes are more than just a protein or a protein name to those who worked on the database. I think it also reflects our hope that the openness of the database will encourage others to contribute to it. The web portal allows anybody who finds something interesting in the literature or has something from their lab that’s been or is about to be published to enter it into the database. Then, experts will do quality control on the annotations, ensuring their quality and validity. This will only be useful if in 10 years’ time it’s not simply a 10-year-old database.

HMN: Did you encounter anything surprising about the findings? Did it surprise you to learn, for instance, that synaptic genes are longer than perhaps any gene we’ve explored to date?

PK: I wouldn’t say I found individual findings surprising. I would say I kind of found it all—encouraging.

The synapse is a phenomenal kind of machine of molecules that span three cellular compartments: the cell that sends the signal, the space between the cells, and the cell that receives the signal. It’s a sort of merger of two cells and the space in between. That structure and function make it more complex than your typical cellular compartment. The synapse is a very complicated connection between two cells that is incredibly fast and highly specialized.

The database and the findings reflect the complexity of the synapse while also validating the proteins associated with those complex functions.

HMN: It is apparent that for synaptic biologists and other researchers this is an exciting tool for further discovery. Yet, the paper also mentions disease applications. Is the database something that could be directly applied in the clinic?

PK: Definitely. One of the goals of this project was to produce a database accessible to clinicians. If a doctor, a geneticist, identifies a new mutation or a new gene, that person can go to the database to determine if the found gene is a synaptic gene. Then, when clinicians identify new mutations, the database will give them an immediate framework. The data even contain the names of annotators, so the clinician can write to the person who annotated the gene, the person who has essentially said “I put my name on this. This is the best we know about the gene.”

People who work at identifying which genes cause which diseases often try to subclassify which “hubs” of biology relate to the diseases. Synapses are one of the hubs that jump out in many neurological disorders. If a clinician or researcher suspects that they are trying to characterize a synaptic disorder, they can use the database for reference and to test against. You can feed entire lists of genes into the portal and then ask how they compare. The paper tested this with a few available lists of genes and determined, for example, that certain autism spectrum-related disorders are related to certain aspects of synaptic biology.

Typically, when a paper on schizophrenia, for instance, is published, a researcher will check whether their three favorite genes are in the paper. Often they are. So the researcher gets to say, “yes, my genes are really important.” But that’s a biased way of validating a gene’s function. Having this resource is encouraging to me, for it will help us decide if what we’re working on makes sense. It will help us check off some of the thoughts, predictions and suspicions we had all along the way but could never really put a finger on.

I think it’s important that people use this resource. It’s vital that it continues to grow. It was built to be inviting, an inviting resource for connecting datasets, scientists and clinicians worldwide.
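For readers curious what "feeding a list of genes in and asking how it compares" can look like in practice, the sketch below shows one common approach: count how many genes in a query list also appear in a reference list, then use a one-sided hypergeometric test to ask whether that overlap is larger than chance. This is an illustration only, not a description of how the SYNGO portal computes its results. The function name, the placeholder gene names and the background size are all assumptions made for the example.

```python
from math import comb


def overlap_enrichment(query_genes, reference_genes, background_size):
    """Compare a query gene list against a reference gene list.

    Returns the shared genes and a one-sided hypergeometric p-value:
    the chance of seeing at least this many shared genes if the query
    list were drawn at random from a background of `background_size` genes.
    Assumes both lists are subsets of that background (an assumption
    of this sketch, not something checked against real annotations).
    """
    query = set(query_genes)
    reference = set(reference_genes)
    shared = query & reference

    n_query = len(query)
    n_reference = len(reference)
    n_shared = len(shared)

    if n_query > background_size or n_reference > background_size:
        raise ValueError("background_size must be at least as large as each list")

    # Sum the hypergeometric probabilities for overlaps of n_shared or more.
    total_draws = comb(background_size, n_query)
    max_overlap = min(n_query, n_reference)
    favorable = sum(
        comb(n_reference, i) * comb(background_size - n_reference, n_query - i)
        for i in range(n_shared, max_overlap + 1)
    )
    return sorted(shared), favorable / total_draws


if __name__ == "__main__":
    # Placeholder names, not real gene symbols or real annotations.
    reference = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
    query = ["GENE_A", "GENE_B", "GENE_C", "GENE_X"]
    shared, p_value = overlap_enrichment(query, reference, background_size=20)
    print(shared, p_value)
```

A small p-value here suggests only that the query list overlaps the reference list more than a random list of the same size would. The result depends heavily on the choice of background, which is why an expert-annotated reference list matters.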

Image: iStock/koto_feja