ABSTRACT
Why do eukaryotic membrane proteins express so poorly in E. coli? This issue has created a significant barrier in the effort to study mammalian membrane protein structures. E. coli is the tried-and-true over-expression system for protein purification, which has made structural biology accessible to countless laboratories. Still, only a handful of eukaryotic membrane protein structures have been determined so far, and the major reason for this is the lack of an economical and easy expression system like E. coli. If expression could be carried out in E. coli, this would improve our ability to investigate mammalian membrane protein structures, especially in light of recent revolutionary developments in single-particle cryo-electron microscopy. There is ample evidence that eukaryotic membrane proteins do express in E. coli, but in most cases the yields are extremely low and it is uncertain whether the protein is functionally folded. However, by tagging targets with GFP at the C-terminus, it is possible to observe single-molecule protein expression directly in E. coli by oblique-angle fluorescence microscopy using sensitive EMCCD detectors. With this approach, we can determine whether expression occurs in the membrane vs. the cytoplasm or inclusion bodies, based on single-particle tracking to measure diffusion. Furthermore, isolation of vesicles from E. coli membranes allows for single-vesicle functional measurements even with low levels of expression. Thus, it is possible to study the remarkably low levels of expression of eukaryotic membrane proteins in E. coli and interrogate whether the production of functionally folded protein can be optimized through genetic manipulation. One reason eukaryotic sequences may fail is improper coding of co-translational folding, i.e., a hidden genetic code that couples the timing of translation with partitioning and folding in the lipid bilayer.
On target systems, we will investigate changes in expression and function while comparing: (a) codon usage, including E. coli-optimized vs. native codons and conservation of rare codon clusters, (b) N-terminal protein sequences and (c) conservation of pause sites such as Shine-Dalgarno elements (prokaryotes) or Alu motifs (eukaryotes). For each of these variables, we will generate chimaeras between homologues that express and those that fail in order to identify which elements lead to successful expression. We will examine three distinct membrane protein families that already have structures for both prokaryotic and eukaryotic homologues: (i) the CLC family of Cl-/H+ transporters and Cl- channels, (ii) Aquaporin water channels and (iii) the 7TM receptor family of membrane proteins, including GPCRs. Finally, we will design a standalone program that will allow for simple alignment of gene sequence, protein sequence and structural elements simultaneously. The end goal of this project is to develop an optimization algorithm that will allow any scientist to take a eukaryotic membrane protein that expresses poorly in E. coli and increase expression to yields that will facilitate biochemical studies such as structure determination by cryo-EM.
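
As an illustration of what comparing "rare codon clusters" between constructs could involve, the following is a minimal sketch in Python, not the project's actual method. It scans a coding sequence with a sliding window and reports stretches enriched in rare codons. The rare-codon set, the function names, and the `window` and `min_rare` thresholds are all assumptions chosen for illustration; a real analysis would derive rarity from a codon usage table for the expression host.

```python
from typing import List, Tuple

# Illustrative assumption: a small set of codons often described as rare in E. coli.
# A real analysis would take rarity from a host-specific codon usage table.
RARE_CODONS = frozenset({"AGA", "AGG", "ATA", "CTA", "CGA", "CGG", "CCC", "GGA"})


def split_codons(cds: str) -> List[str]:
    """Split an in-frame coding sequence (DNA or RNA letters) into DNA codons."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def rare_codon_clusters(
    cds: str, window: int = 6, min_rare: int = 3
) -> List[Tuple[int, int]]:
    """Return merged (start, end) codon-index ranges, end exclusive, in which
    some window of `window` codons contains at least `min_rare` rare codons.
    The thresholds are hypothetical defaults, not values from the source."""
    codons = split_codons(cds)
    flags = [codon in RARE_CODONS for codon in codons]
    clusters: List[Tuple[int, int]] = []
    for start in range(max(len(codons) - window + 1, 0)):
        if sum(flags[start:start + window]) >= min_rare:
            end = start + window
            if clusters and start <= clusters[-1][1]:
                # Overlapping or adjacent window: extend the previous cluster.
                clusters[-1] = (clusters[-1][0], end)
            else:
                clusters.append((start, end))
    return clusters
```

Running `rare_codon_clusters` on the native and the E. coli-optimized version of the same gene would show which clusters codon optimization removes, which is one way to frame the comparison in item (a).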