
Article

Mining enzyme diversity of transcriptome libraries through DNA synthesis for benzylisoquinoline alkaloid pathway optimization in yeast

Lauren Narcross, Leanne Bourgeois, Elena Fossati, Euan Burton, and Vincent J. J. Martin

ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.6b00119 • Publication Date (Web): 21 Jul 2016



Mining enzyme diversity of transcriptome libraries through DNA synthesis for benzylisoquinoline alkaloid pathway optimization in yeast

Lauren Narcross1,2, Leanne Bourgeois1,2, Elena Fossati, Euan Burton1,2, Vincent J.J. Martin1,2*

1Department of Biology, Concordia University, Montréal, Québec, Canada, H4B 1R6

2Centre for Structural and Functional Genomics, Concordia University, Montréal, Québec, Canada, H4B 1R6

* Corresponding author: E-mail: [email protected]


Abstract

The ever-increasing quantity of data deposited to GenBank is a valuable resource for mining new enzyme activities. Falling costs of DNA synthesis enable metabolic engineers to take advantage of this resource for identifying superior or novel enzymes for pathway optimization. Previously, we reported synthesis of the benzylisoquinoline alkaloid dihydrosanguinarine in yeast from norlaudanosoline at a molar conversion of 1.5%. Molar conversion could be improved by reduction of the side-product N-methylcheilanthifoline, a key bottleneck in dihydrosanguinarine biosynthesis. Two pathway enzymes, an N-methyltransferase and a cytochrome P450 of the CYP719A subfamily, were implicated in the synthesis of the side-product. Here, we conducted an extensive screen to identify enzyme homologs whose co-expression reduces side-product synthesis. Phylogenetic trees were generated from multiple sources of sequence data to identify a library of candidate enzymes that were purchased codon-optimized and pre-cloned into expression vectors designed to facilitate high-throughput analysis of gene expression as well as activity assay. Simple in vivo assays were sufficient to guide the selection of superior enzyme homologs that ablated the synthesis of the side-product and improved molar conversion of norlaudanosoline to dihydrosanguinarine to 10%.

Keywords

Synthetic DNA; transcriptome mining; benzylisoquinoline alkaloids; dihydrosanguinarine; Saccharomyces cerevisiae; pathway optimization


Introduction

A vast amount of genome and transcriptome data is deposited in publicly available resources such as GenBank, which reached a milestone of one trillion base pairs of sequence data in January 2015.1 More targeted databases like the Thousand Plant2 and PhytoMetaSyn Projects3-4 provide further sources of sequence information. Such in silico resources are valuable for evolutionary analysis but have traditionally presented few opportunities for metabolic engineers due to the lack of physical DNA available to them.5 Until recently, the RNA used to generate transcriptome sequence databases was also the source of cDNA used for the targeted amplification of putative ORFs and gene discovery.6-13 With the cost of DNA synthesis falling from $1/bp in 2006 to $0.12/bp in 2014,14 digital sources of DNA sequences are becoming broadly-accessible primary resources of unique enzymes for the purposes of pathway optimization and the identification of novel activities. This information represents an attractively simple alternative to more traditional methods of pathway optimization through protein engineering approaches such as directed evolution15-16 or rational modification.17-18 For example, heterologous synthesis of methyl halides was enabled through the screening of 89 putative methyl halide transferases from metagenomics data deposited to NCBI.19 More recently, an enzyme bottleneck in the heterologous synthesis of coumarate was alleviated through the screening of a library of both putative and published enzymes purchased entirely from GenBank.20

Previously, we reported the reconstitution in Saccharomyces cerevisiae of a 10-gene pathway for the synthesis of the benzylisoquinoline alkaloid (BIA) dihydrosanguinarine, a reduced form of the antimicrobial sanguinarine, from the precursor norlaudanosoline (Figure 1A).21 De novo synthesis of benzylisoquinolines in yeast is currently at the microgram/liter level due to low precursor titers and poor performance of the enzyme catalyzing the committed step,22-24 although the conversion of norlaudanosoline to downstream products can also be inefficient. When supplemented to culture medium, the highest-reported molar conversion of the substrate norlaudanosoline to the key branch point reticuline in yeast is 20%.21, 24 The additional co-expression of the 7 enzymes necessary for dihydrosanguinarine synthesis from reticuline drops molar conversion to 1.5%,21 with the accumulation of the dead-end intermediate N-methylcheilanthifoline contributing to the drop in yield (Figure 1A).21

Conversion of the dihydrosanguinarine pathway intermediate scoulerine to stylopine is catalyzed by two cytochrome P450 enzymes of the CYP719A subfamily: cheilanthifoline synthase (CFS) converts scoulerine to cheilanthifoline, and stylopine synthase (SPS) converts cheilanthifoline to stylopine (Figure 1A). However, in in vitro and heterologous in vivo systems, stylopine synthase activity is insufficient and leads to cheilanthifoline accumulation.21, 25 Since cheilanthifoline is also a substrate for the promiscuous N-methylating enzyme tetrahydroprotoberberine N-methyltransferase (TNMT), accumulation of cheilanthifoline results in the synthesis of the non-productive intermediate N-methylcheilanthifoline.

Enzyme promiscuity leading to non-productive intermediates is a common problem in heterologous pathway reconstitution.24, 26-30 Many strategies for reducing side-reactions, such as compartmentalization of competing reactions or enzyme engineering for improved specificity, work within the constraints of enzymes currently in use.24, 31-34 Here, using the dihydrosanguinarine pathway, we demonstrate the power of mining transcriptome libraries combined with gene synthesis as an effective strategy for pathway engineering. We postulated that either an SPS that is able to outcompete TNMT for cheilanthifoline, or a TNMT with narrower substrate selectivity, or both, would prevent N-methylcheilanthifoline synthesis and greatly improve current dihydrosanguinarine yields.

In this work, two enzyme libraries, one of TNMTs and one of CYP719s, were purchased as codon-optimized synthetic genes and screened individually and in combinations. In assaying these 73 enzymes, a new activity was discovered, which inspired a new route to stylopine synthesis (Figure 1B). Consequently, synthesis of N-methylcheilanthifoline was ablated. The newly engineered dihydrosanguinarine pathway now reaches 10% conversion in yeast cultures supplemented with the precursor norlaudanosoline. The strategy described here is a simple alternative to more rational methods and can be applied to any pathway that requires optimization.

Results


Generation of CYP719 and NMT enzyme libraries

Synthesis of the BIA dihydrosanguinarine from norlaudanosoline requires nine enzymatic reactions (Figure 1A). Two of these are catalyzed by N-methyltransferases: conversion of 6-O-methylnorlaudanosoline to 3’-hydroxy-N-methylcoclaurine by coclaurine N-methyltransferase (CNMT), and conversion of stylopine to N-methylstylopine by tetrahydroprotoberberine N-methyltransferase (TNMT). NMTs from BIA-producing plants can accept a variety of BIAs as substrates (Table 1). Nevertheless, NMTs have also been demonstrated to differentiate between BIAs that differ by a single methyl group or methylenedioxy bridge.7 Thus, one approach to reducing N-methylcheilanthifoline synthesis was to identify a TNMT that accepted stylopine but not cheilanthifoline. CNMTs were also considered, as some can N-methylate downstream dihydrosanguinarine pathway intermediates (Table 1). The reverse-translated PhytoMetaSyn transcriptome database was queried using a conserved TNMT/CNMT motif. Putative ORFs selected from the transcriptome database were aligned with published TNMTs/CNMTs using MUSCLE and a phylogenetic tree was generated with the program MEGA6 (Figure 2A).35 The phylogenetic tree served as a guide for the choice of enzyme candidates to be screened. A total of 15 published and putative NMTs were purchased.
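As an illustration of the motif query described above, the following minimal sketch searches translated ORFs for a conserved motif with a regular expression. The text does not state the TNMT/CNMT motif that was used, so the pattern `HYPOTHETICAL_MOTIF`, the ORF names and the toy sequences are all placeholders invented for this example and should not be read as the actual query.

```python
import re

# Hypothetical placeholder motif written as a regular expression.
# The actual conserved TNMT/CNMT motif is not given in the text.
HYPOTHETICAL_MOTIF = re.compile(r"G[CS]G.G")

# Toy translated ORFs; names and sequences are invented for illustration.
translated_orfs = {
    "orf_001": "MKTLLDIGCGWGSLAIHLAE",
    "orf_002": "MSSPQRTTVLKKAEELNN",
    "orf_003": "MAAVDVGSGTGRLFAKYP",
}

def find_motif_hits(orfs, motif):
    """Return {name: start index of the first motif match} for ORFs containing the motif."""
    hits = {}
    for name, seq in orfs.items():
        match = motif.search(seq)
        if match:
            hits[name] = match.start()
    return hits

# ORFs that contain the placeholder motif would move on to alignment.
candidates = find_motif_hits(translated_orfs, HYPOTHETICAL_MOTIF)
```

In a real workflow the hits would then be aligned with published sequences (MUSCLE) and placed on a tree (MEGA6), as the text describes; this sketch covers only the initial filtering step.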

Conversion of the dihydrosanguinarine pathway intermediate scoulerine to stylopine requires the formation of two methylenedioxy bridges, indicated by “A” and “B” in Figure 1B. While theoretically the reactions could occur in either order, it has been experimentally determined in planta that Ring B closure (catalyzed by CFS) occurs before Ring A closure (catalyzed by SPS).36 Both CFS and SPS are cytochrome P450s in the CYP719A subfamily. This subfamily also includes other members that catalyze methylenedioxy bridge formations on other BIAs and other alkaloids (Table 2), and still other methylenedioxy bridge-containing alkaloids have been identified for which the appropriate methylenedioxy bridge-forming enzymes are still unknown. Diversity amongst methylenedioxy bridge-containing alkaloids and CYP719 substrate acceptance profiles suggests that the CYP719 enzyme family is extensive and may include SPS enzyme homologs that are more appropriate for heterologous reconstitution of the dihydrosanguinarine pathway. Traditionally, the naming scheme for CYP719s is based on an identified product (e.g. stylopine synthase and cheilanthifoline synthase). However, this naming scheme becomes untenable when the same enzyme can synthesize multiple products. Here, we refer to CYP719s by the location of methylenedioxy bridge formation: Ring A-closing CYP719s and Ring B-closing CYP719s.

Reverse-translated transcriptome data from the PhytoMetaSyn database was queried for a conserved heme-binding cytochrome P450 motif and an N-terminal motif conserved amongst published CYP719s. Query results were narrowed down using BLASTclust. BLASTclust sorts sequences into groups using the criterion of percent sequence identity, which is convenient for the study of CYPs because CYP families and subfamilies are defined based on this criterion (45% amino acid identity defines a family, 55% amino acid identity defines a subfamily).37 Stringency was set to 50% in order to include the CYP719B subfamily, which also has activity on BIAs.38 Putative CYP719s that clustered with published CYP719s were aligned with MUSCLE, and a phylogenetic tree was generated using MEGA6 (Figure 2B).35 Three clades were observed, which were assigned predicted activities based on the co-alignment with characterized CYP719s: Ring A-closing CYP719s, further segregated into two subclades of CYP719s predicted to act on cheilanthifoline (stylopine synthases) or on the BIA tetrahydrocolumbamine (canadine synthases); Ring B-closing CYP719s, further segregated into one subclade of cheilanthifoline synthases and one subclade of CYP719Bs; and CYP719s with unknown activities. A total of 54 characterized and putative CYP719s were purchased for screening.
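The effect of the identity threshold on grouping can be sketched with a toy single-linkage clustering. This is a stand-in for the idea of grouping by percent identity, not a reimplementation of BLASTclust; the sequence labels and pairwise identities below are invented for illustration.

```python
# Invented pairwise percent identities between four toy sequences.
pairwise_identity = {
    ("A", "B"): 72.0,
    ("A", "C"): 51.0,
    ("B", "C"): 48.0,
    ("A", "D"): 30.0,
    ("B", "D"): 28.0,
    ("C", "D"): 33.0,
}

def cluster_by_identity(names, identities, threshold):
    """Single-linkage grouping: join any two sequences whose identity >= threshold."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), pct in identities.items():
        if pct >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])

names = ["A", "B", "C", "D"]
clusters_50 = cluster_by_identity(names, pairwise_identity, 50.0)  # looser cut-off
clusters_55 = cluster_by_identity(names, pairwise_identity, 55.0)  # subfamily-level cut-off
```

With these toy numbers, sequence C joins the A/B group at 50% but is split off at 55%, which mirrors the rationale given in the text for relaxing the stringency so that a neighboring subfamily is retained.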

Selection of replacement Ring A-closing CYP719s

The library of CYP719s included enzymes with characterized activity on relevant BIAs (scoulerine and cheilanthifoline), other BIAs or other alkaloids, as well as enzymes with predicted activity or no predicted activity (Table 2, Figure 2B). Following a qualitative assessment of CYP719 expression through comparison of the fluorescence of CYP719-GFP fusion proteins (see Supporting Results & Discussion), an initial activity screen was performed to validate predicted Ring A- and Ring B-closing activities within the CYP719 library. CYP719s were expressed in yeast and supplemented with the dihydrosanguinarine pathway intermediate scoulerine. Scoulerine is a substrate for both Ring A- and Ring B-closing CYP719s, forming nandinine and cheilanthifoline, respectively (Figure 3A). Although nandinine and cheilanthifoline have the same mass and similar structures, they are distinguishable by HPLC-MS both through elution time and by MS/MS profile (Figure 3B). Analysis of nandinine and cheilanthifoline synthesis confirmed that predicted Ring A- and Ring B-closing activities were generally accurate with no cases of Ring B-closure where Ring A-closure was predicted, or vice versa (Figures 3C, 3D). In addition, no conversion of scoulerine was detected for predicted CYP719Bs, CYP719s with characterized activity on non-BIAs, or putative CYP719s with no predicted activity. Scoulerine was widely accepted amongst Ring B-closing CYP719s with 10 of 12 candidates converting >95% of the scoulerine to cheilanthifoline (Figures 3C, 3D). Conversely, a greater range of nandinine synthesis was observed amongst predicted Ring A-closing CYP719s. Of these, 7 of 18 predicted stylopine synthases and 3 of 16 predicted canadine synthases converted >95% of the scoulerine to nandinine (Figures 3C, 3D). These Ring A-closing CYP719s were considered for further characterization.
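The >95% selection criterion used in this screen can be sketched as follows. The text does not specify how conversion was calculated from the HPLC-MS data, so this example assumes conversion is the product amount divided by the sum of product and residual substrate; the candidate names and measurements are invented.

```python
def percent_conversion(product_signal, substrate_signal):
    """Percent of substrate converted, from product and residual substrate amounts."""
    total = product_signal + substrate_signal
    if total == 0:
        return 0.0
    return 100.0 * product_signal / total

# Invented measurements in arbitrary units: (product, residual scoulerine).
measurements = {
    "candidate_1": (98.0, 2.0),
    "candidate_2": (60.0, 40.0),
    "candidate_3": (0.0, 100.0),
}

THRESHOLD = 95.0  # the text uses >95% conversion as the selection cut-off

# Keep only candidates that clear the cut-off.
selected = [
    name
    for name, (product, substrate) in measurements.items()
    if percent_conversion(product, substrate) > THRESHOLD
]
```

Under this assumption only `candidate_1` would be carried forward; a different normalization (for example against an internal standard) would change the numbers but not the filtering logic.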

Next, selected Ring A-closing CYP719s were co-expressed with the Ring B-closing CYP719 PsCFS, previously used in the heterologous reconstitution of dihydrosanguinarine synthesis.21, 39 When supplemented with scoulerine, the expected product is stylopine (Figure 4A). In each yeast strain, scoulerine was entirely consumed but different proportions of cheilanthifoline, nandinine, and stylopine were observed depending on the co-expressed Ring A-closing CYP719 (Figure 4A). Expression of Ring A-closing CYP719s predicted to be canadine synthases resulted in residual cheilanthifoline. These candidates were not considered for further screening. With one exception, cheilanthifoline was not detected when the Ring A-closing CYP719 was predicted to be a stylopine synthase.

In contrast to previously reported combinations of cheilanthifoline and stylopine synthases, nandinine was observed as a product in each combination of Ring A-closing CYP719s and PsCFS.21, 25, 39 Nandinine synthesis resulted from Ring A-closing CYP719s out-competing PsCFS for scoulerine. Accumulation of nandinine also indicated that it is not a preferred substrate of PsCFS. Improved activity of Ring A-closing CYP719s relative to PsCFS is a desired quality, but the generation of a new side product is not. Nandinine accumulation could be avoided by limiting the pool of potential SPSs to those with activity on cheilanthifoline but not scoulerine. Because many Ring A-closing CYP719s in the library can accept scoulerine, this is not an ideal limitation. Alternatively, if a Ring B-closing CYP719 could be identified that also accepts nandinine as substrate, then nandinine could be re-captured into the main pathway, shifting from a side-product to a pathway intermediate (Figure 1B).

Engineering of a non-natural stylopine synthesis pathway

To simplify further assessment of Ring A- and Ring B-closing CYP719 activity, the appropriate substrates were supplied directly to yeast strains. As cheilanthifoline and nandinine were not available commercially, they were generated from scoulerine by incubation with yeast expressing an appropriate CYP719. Supernatant containing cheilanthifoline was used as substrate to test the activity of Ring A-closing CYP719s (Figure 4B), while supernatant containing nandinine was used to test Ring B-closing CYP719s (Figure 4C). As was suggested in co-expression analysis, all selected Ring A-closing CYP719s were able to convert >95% of cheilanthifoline to stylopine. Many Ring B-closing CYP719s had some activity on nandinine, with 2 of 10 candidates (Sdi-1 and Cma-2) converting >95% of supplemented nandinine to stylopine. Acceptance of nandinine by the two Ring B-closing CYP719s enables the non-natural “Ring A first” pathway for stylopine synthesis (Figure 1B). Hence, the six Ring A-closing CYP719s and two Ring B-closing CYP719s that converted >95% of their supplemented BIA to stylopine were selected for combinatorial testing in the presence of TNMT.
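The logic of the two routes from scoulerine to stylopine can be sketched as a small reaction graph. This is a qualitative illustration only: it ignores kinetics and competition, treats every available reaction as running to completion, and uses activity labels (such as `ringB_on_nandinine`) that are shorthand introduced here rather than names from the paper.

```python
# Reactions as (substrate, product, required activity).
REACTIONS = [
    ("scoulerine", "cheilanthifoline", "ringB_on_scoulerine"),
    ("scoulerine", "nandinine", "ringA_on_scoulerine"),
    ("cheilanthifoline", "stylopine", "ringA_on_cheilanthifoline"),
    ("nandinine", "stylopine", "ringB_on_nandinine"),
]

def end_products(start, activities):
    """Metabolites reachable from `start` that no available activity consumes."""
    edges = {}
    for substrate, product, activity in REACTIONS:
        if activity in activities:
            edges.setdefault(substrate, []).append(product)
    seen, stack, terminals = set(), [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in edges:
            stack.extend(edges[node])
        else:
            terminals.add(node)
    return terminals

ring_a = {"ringA_on_scoulerine", "ringA_on_cheilanthifoline"}

# Ring B enzyme that does not accept nandinine: nandinine is a dead end.
with_pscfs_like = end_products("scoulerine", ring_a | {"ringB_on_scoulerine"})

# Ring B enzyme that also accepts nandinine: both routes converge on stylopine.
with_nandinine_acceptor = end_products(
    "scoulerine", ring_a | {"ringB_on_scoulerine", "ringB_on_nandinine"}
)
```

In this simplified model, adding a Ring B-closing activity on nandinine turns nandinine from a terminal side-product into an intermediate, which is the reasoning behind the “Ring A first” route.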

Combinatorial testing of Ring A- and Ring B-closing CYP719s

Combinations of Ring A- and Ring B-closing CYP719s were next co-expressed in the presence and absence of TNMT to measure production of downstream N-methylated BIAs (Figure 5). When supplemented with scoulerine, stylopine should be the product in the absence of TNMT and N-methylstylopine should be the product in the presence of TNMT. Any accumulated cheilanthifoline, nandinine, or their N-methylated derivatives would indicate an undesired combination of CYP719s. Nandinine was observed during co-expression of PsCFS with Ring A-closing CYP719s, but not with either of the two selected Ring B-closing CYP719s. Residual cheilanthifoline (and N-methylcheilanthifoline in the presence of TNMT) was observed in samples expressing 2 of the 6 Ring A-closing CYP719s, which was not expected because these enzymes previously converted >95% of cheilanthifoline to stylopine (Figure 4B). Between experiments, Ring A-closing CYP719s had been placed under the control of a new promoter/terminator pair in order to allow homology-mediated cloning of a double CYP719 gene cassette. The other 4 Ring A-closing CYP719s, when co-expressed with either of the 2 Ring B-closing CYP719s, resulted in >95% conversion of scoulerine to stylopine in the absence of TNMT, and >95% conversion of scoulerine to N-methylstylopine in the presence of TNMT. These combinations were selected for integration into the dihydrosanguinarine pathway.
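The size of the combinatorial design described above can be sketched by enumerating the strain set. Only the library sizes (six Ring A-closing and two Ring B-closing candidates, each tested with and without TNMT) and the names Sdi-1 and Cma-2 come from the text; the `RingA_n` labels are placeholders, and the sketch omits the PsCFS comparison strains.

```python
from itertools import product

# Placeholder labels for the six selected Ring A-closing CYP719s.
ring_a_candidates = [f"RingA_{i}" for i in range(1, 7)]
# The two Ring B-closing CYP719s named in the text.
ring_b_candidates = ["Sdi-1", "Cma-2"]
# Each pairing is tested without and with TNMT.
tnmt_conditions = [False, True]

strains = [
    {"ring_a": a, "ring_b": b, "tnmt": t}
    for a, b, t in product(ring_a_candidates, ring_b_candidates, tnmt_conditions)
]

def expected_product(strain):
    """Desired end product from supplemented scoulerine for a given strain design."""
    return "N-methylstylopine" if strain["tnmt"] else "stylopine"
```

Enumerating the design this way gives 6 × 2 × 2 = 24 strains, and any strain that accumulates something other than its `expected_product` would be flagged as an undesired combination.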

Selection of replacement TNMT

Before retrofitting a final dihydrosanguinarine production strain with the newly discovered CYP719s, the NMT library was screened for NMTs with greater substrate preference for stylopine. Yeast strains harboring the NMT library were supplemented with scoulerine (Figure 6A) and stylopine (Figure 6B) in order to assess relevant substrate acceptance profiles. In addition, the activity of NMTs on cheilanthifoline was assayed through the supplementation of scoulerine to yeast strains co-expressing the NMT library and the Ring B-closing CYP719 PsCFS (Figure 6C). Included, as negative controls, were two O-methyltransferases (OMTs) in the dihydrosanguinarine pathway that were not expected to methylate the supplemented BIAs.40 In general, if an NMT N-methylated stylopine, it also N-methylated cheilanthifoline and scoulerine (Figure 6). These NMTs all aligned with published TNMTs, but not every enzyme aligning with TNMTs had activity on the BIAs tested here. Amongst CNMTs, PsCNMT was uniquely able to N-methylate scoulerine and cheilanthifoline, but no CNMT was able to N-methylate stylopine. Furthermore, when co-expressed with PsCFS, NMTs with no measurable activity on the BIAs tested appeared to interfere with cheilanthifoline synthesis, as conversion of scoulerine to cheilanthifoline was lower than in either the empty vector or OMT control strains. Since we did not identify an NMT that methylated stylopine without also methylating scoulerine and/or cheilanthifoline, the strategy of identifying an NMT with a different substrate acceptance profile was not pursued further.

Retrofitting and testing an optimized dihydrosanguinarine producing strain

Through CRISPR-directed homologous recombination, most of the norlaudanosoline-to-dihydrosanguinarine pathway was chromosomally integrated into a single yeast strain. Missing were the CYP719 Ring A- and Ring B-closing enzymes, which were combinatorially co-expressed from plasmids. Cultures of the resulting strains were then supplemented with norlaudanosoline at increasing concentrations, and levels of N-methylated intermediates and dihydrosanguinarine were measured. In the absence of CYP719s, N-methylscoulerine was produced as a result of the accumulation of scoulerine (Figure 7A). Normalized N-methylscoulerine levels remained constant across increasing substrate concentrations, which indicated that there were no measurable bottlenecks in the pathway from norlaudanosoline to scoulerine. N-methylscoulerine was not observed when CYP719s were expressed. As expected, BIAs extracted from yeast cultures expressing the original CYP719 combination (PsSPS and PsCFS) showed N-methylcheilanthifoline accumulation at every norlaudanosoline concentration. Further, as norlaudanosoline concentration increased, N-methylcheilanthifoline levels rose relative to normalized values. In contrast, N-methylcheilanthifoline did not accumulate with any combination of the newly selected CYP719s, at any concentration of norlaudanosoline. When the strain co-expressing PsSPS and PsCFS was supplemented with 10 µM norlaudanosoline, 5% was converted to dihydrosanguinarine and sanguinarine (Figure 7B). Multiple combinations of Ring A- and Ring B-closing CYP719s resulted in improved levels of dihydrosanguinarine and sanguinarine, reaching ~10% conversion of norlaudanosoline to dihydrosanguinarine with no intermediate or side product observed. At higher concentrations of norlaudanosoline, the pathway intermediate N-methylstylopine accumulated, increasing relative to normalized norlaudanosoline conversion values.
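To make the conversion figures concrete, the following sketch turns the molar conversions quoted above into product concentrations. It assumes 1:1 stoichiometry between norlaudanosoline consumed and product formed, and it treats the approximate "~10%" value as exactly 10% for the arithmetic; the function name is introduced here for illustration.

```python
def product_concentration_um(substrate_um, percent_conversion):
    """Product concentration (µM) for a given molar conversion of the substrate."""
    return substrate_um * percent_conversion / 100.0

substrate_um = 10.0  # norlaudanosoline supplied (µM), from the text

# Original CYP719 pair (PsSPS and PsCFS): 5% conversion.
original = product_concentration_um(substrate_um, 5.0)
# Best new Ring A / Ring B combinations: roughly 10% conversion.
improved = product_concentration_um(substrate_um, 10.0)

# Relative improvement within this experiment.
fold_change = improved / original
```

Under these assumptions, 10 µM norlaudanosoline yields about 0.5 µM product with the original enzyme pair and about 1 µM with the best new combinations, a roughly two-fold gain within this experiment.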

Discussion

With the cost of next-generation DNA sequencing less than $1/million base pairs,14 transcriptome databases can be cheaply generated and used for enzyme discovery.6, 8, 41-43 Historically, harnessing the power of transcriptome databases was difficult without access to the physical RNA used to generate the libraries. Now, advances in DNA synthesis technologies44 and molecular biology techniques for efficient heterologous gene expression45-48 can accelerate and improve the enzyme discovery process. Here, we demonstrate the power of combined accessibility to transcriptome data and affordable gene synthesis for pathway engineering and optimization. Not only did our synthetic gene library contain multiple enzymes capable of improving the pathway, it also included a novel activity enabling pathway redesign to prevent the synthesis of side products. As the cost of DNA synthesis continues to drop, we foresee this strategy becoming increasingly common for pathway engineering and optimization.

The work presented here is an example of the power and limits of predictive search for enzymatic activities from sequence data. Predicted activities for putative ORFs were assigned based on co-alignment with characterized enzymes within clades. For both NMTs and CYP719s, predictions of broad activity were accurate (i.e. N-methylase activity or Ring A- vs. Ring B-closure), but enzymes co-aligned within a sub-clade did not necessarily have the same substrate acceptance profiles. This is not an uncommon phenomenon,6, 49-50 which highlights the value in using large libraries of orthologous enzymes to increase the chances of finding an activity of interest.

For both published NMTs and CYP719s, our analysis of substrate preferences included more positive hits than previous characterization. For instance, TNMTs have been demonstrated to have much lower activity on scoulerine than stylopine (0-10% of relative activity depending on the homolog),7, 51-53 whereas no difference was observed here (Figures 6A, 6C). Similarly, some CYP719s such as CYP719A1 have been shown to have little to no activity on scoulerine54 but here were able to convert >95% of supplemented scoulerine to nandinine. Unlike traditional biochemical assays, supplementation and bioconversion assays often occur over a longer duration (>16 hrs vs. 1 hr) and are a measure of reaction progression, not reaction speed.55-56 For the identification of enzymes with novel activities for heterologous pathway engineering, longer incubation times in in vivo conditions may flag candidates that might otherwise be discarded.
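The difference between measuring reaction speed and reaction progression can be illustrated with a simple kinetic sketch. It assumes first-order substrate depletion, which is a simplification chosen for illustration and not a model taken from the paper, and the rate constants are hypothetical.

```python
import math

def fraction_converted(rate_per_hr, hours):
    """First-order substrate depletion: fraction converted after `hours`."""
    return 1.0 - math.exp(-rate_per_hr * hours)

fast_rate = 3.0  # hypothetical rate constant (1/hr)
slow_rate = 0.3  # hypothetical enzyme with 10% of the fast rate

# Short assay (1 hr): the two enzymes look very different.
short_assay = {
    "fast": fraction_converted(fast_rate, 1),
    "slow": fraction_converted(slow_rate, 1),
}

# Long bioconversion (16 hr): both approach complete conversion.
long_assay = {
    "fast": fraction_converted(fast_rate, 16),
    "slow": fraction_converted(slow_rate, 16),
}
```

With these hypothetical values, the slower enzyme converts well under half of the substrate in 1 hr but more than 95% in 16 hr, so an endpoint screen would score both enzymes as positive hits even though their rates differ ten-fold.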

In this work, all CYP719s were co-expressed with a single cytochrome P450 reductase (CPR) from P. somniferum (PsCPR). This setup was sufficient to identify both Ring A- and Ring B-closing CYP719s from multiple plant species with activity on supplemented BIAs. However, PsCPR may not be an ideal partner for all the CYPs in the library. It has long been recognized that no single CPR can support the activity of all CYPs.57 While a CPR from the same species is often used if available,58-60 the existence of multiple CPRs in plants complicates the selection process.61 Further, the actual relationship between CYP and CPR is unpredictable, with CPRs from other plants like Arabidopsis thaliana38-39, 62 and even the native yeast CPR22 supporting heterologous activity of some CYPs. Therefore, we presume that combinatorial co-expression of the CYP719 library with a putative CPR library from other organisms may improve the activity of some of the CYPs in our library.

Enzyme promiscuity presents serious challenges for the reconstitution of heterologous pathways,24, 34, 63 which can compound as a pathway increases in size. For example, scoulerine is a substrate for 4 of 9 enzymatic steps in the dihydrosanguinarine pathway (Tables 1 and 2). Nevertheless, in vivo combinatorial screens identified multiple enzyme combinations in which no scoulerine side-products were found to accumulate. Initially, nandinine synthesis was an undesired activity. Although nandinine acceptance by Ring B-closing CYP719s had not been characterized (Table 2), and although published Ring B-closing CYP719s had little activity on nandinine when assayed here, our enzyme library contained CYP719s that could synthesize stylopine from nandinine. Thus, a side-product generated through enzyme promiscuity was re-introduced into the pathway through the activity of a second promiscuous enzyme. Using complementary substrate acceptance profiles to achieve a single product of interest is still new,34 although we expect this approach will become more prominent as large enzyme libraries become increasingly accessible through gene synthesis.

The conversion of norlaudanosoline to dihydrosanguinarine in this work compares favorably to other reports of heterologous dihydrosanguinarine/sanguinarine synthesis. Norlaudanosoline supplementation in this work is lower than in the system published by Trenchard et al. (10 µM vs. 2 mM), but yield is higher (10% conversion to dihydrosanguinarine and sanguinarine vs. 0.012% conversion to sanguinarine).39 Following our identification of a bottleneck at SPS,21 Trenchard et al. performed a 2 × 2 combinatorial search for a new CFS and SPS, ultimately selecting CYP719A5 and CYP719A2, respectively.39 While both enzymes were included in our screen, neither was ultimately selected; CYP719A5 did not have sufficient activity on nandinine, and CYP719A2 did not have sufficient activity on scoulerine. CYP719A2 also displays low fluorescence as a GFP fusion protein, whereas enzymes selected for integration into the dihydrosanguinarine pathway tended to display the highest levels of fluorescence amongst enzymes with any particular desired activity (see Supporting Results and Discussion).
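The comparison with Trenchard et al. can also be restated as absolute product concentrations. The following is a minimal sketch, assuming that both percentages are molar conversions relative to the supplemented norlaudanosoline (the text does not state the basis explicitly) and ignoring that the product pools differ (dihydrosanguinarine plus sanguinarine here, sanguinarine alone in the earlier report). The function name is illustrative only.

```python
def product_concentration_um(substrate_um: float, conversion_percent: float) -> float:
    """Product formed (µM) for a given substrate feed (µM) and percent conversion.

    Assumption: percent conversion is molar and relative to the fed substrate.
    """
    return substrate_um * conversion_percent / 100.0


# Values quoted in the text: 10 µM at 10% (this work) vs. 2 mM at 0.012% (Trenchard et al.)
this_work = product_concentration_um(10.0, 10.0)
trenchard = product_concentration_um(2000.0, 0.012)

print(f"this work:        {this_work:.2f} µM product")
print(f"Trenchard et al.: {trenchard:.2f} µM product")
print(f"ratio of percent conversions: {10.0 / 0.012:.0f}-fold")
```

Under these assumptions, the 10 µM feed gives about 1 µM of product and the 2 mM feed gives about 0.24 µM, so the higher percent conversion reported here also corresponds to a higher absolute titer despite the 200-fold lower substrate feed.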

While the system here represents a 10-fold improvement over previous work,21 the pathway can be improved further. Conversion of norlaudanosoline to dihydrosanguinarine was 10% at 10 µM norlaudanosoline. However, as supplemented norlaudanosoline concentrations increased, so did buildup of the intermediate N-methylstylopine. This points to the downstream enzyme N-methylstylopine hydroxylase (MSH), a member of the CYP82 family, as the next target for improvement. A single MSH has been identified thus far,64 but other members of the same family have been demonstrated to have activity on BIAs.65-66 The strategy of enzyme selection and screening employed here may be able to improve this next step as well.

Commonly cited strategies for heterologous pathway optimization include control of enzyme transcription and translation, as well as spatial control of enzymes through the use of scaffolds or targeting signals.67 The work presented here adds to the growing number of studies that highlight the potential of transcriptome libraries to provide new solutions to these problems. As the cost of DNA synthesis continues to drop, the screening of enzyme homolog libraries should be considered an integral part of pathway engineering.

Materials and Methods

Transcriptome data analysis

The PhytoMetaSyn database (www.phytometasyn.ca) of assembled transcriptome data from BIA-producing plants3 was translated into putative ORFs using OrfPredictor.68 Translated ORFs were scanned for motifs of interest generated from sequence alignments of published proteins and candidate sequences kindly provided by Dr. Peter Facchini (University of Calgary) (NMTs: ERAQI(K/Q)DG; CYP719s: FxxGxxxCxG, PxIGN). Putative NMTs identified from the library were aligned with published and candidate sequences, phylogenetic trees were generated, and a subset was manually selected for testing. Putative CYP719s, published CYP719s, and CYP719s deposited online in GenBank and the Cytochrome P450 Homepage69 were grouped by BLASTclust into groups with 55% sequence identity at the amino acid level over 95% of the sequence length. These groups were then aligned and phylogenetic trees were generated in the same manner as for NMTs, and a subset was manually chosen for testing. All alignments were performed with MUSCLE and phylogenetic trees were built by the neighbor-joining method using the program MEGA6.0 (ref. 35) with 1000 bootstrap replicates. When indicated, branches with confidence values below 80% were condensed to build condensed phylogenetic trees. Enzymes selected for study were codon-optimized for expression in yeast and synthesized by Gen9 (Cambridge, MA).
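The motif scan described above can be illustrated with regular expressions. This is a minimal sketch and not the pipeline used in this work: it assumes that "x" denotes any residue, that "(K/Q)" denotes either residue, and that a CYP719 candidate must carry both listed motifs (the text does not say whether one or both were required). The function names and the toy sequences are hypothetical.

```python
import re

# Motifs as written in the text, keyed by enzyme family.
MOTIFS = {
    "NMT": ["ERAQI(K/Q)DG"],
    "CYP719": ["FxxGxxxCxG", "PxIGN"],
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert the motif shorthand into a compiled regular expression."""
    # "(K/Q)" style alternatives become a character class such as "[KQ]".
    pattern = re.sub(
        r"\(([A-Z](?:/[A-Z])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    # Lowercase "x" stands for any single residue.
    pattern = pattern.replace("x", "[A-Z]")
    return re.compile(pattern)


def classify(protein: str) -> list[str]:
    """Return the families whose motifs are all found in the protein sequence."""
    seq = protein.upper()
    return [
        family
        for family, motifs in MOTIFS.items()
        if all(motif_to_regex(m).search(seq) for m in motifs)
    ]


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not real enzymes.
    for name, seq in [
        ("toy_nmt", "MAAERAQIKDGLLK"),
        ("toy_cyp719", "MKPLIGNAAAFGAGRRACPGTT"),
        ("toy_other", "MKPLIGNAAA"),
    ]:
        print(name, classify(seq))
```

Requiring every motif of a family is the stricter choice; replacing `all` with `any` would return candidates that match at least one motif, at the cost of more false positives to remove during alignment and tree building.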

Construction of plasmids

All cloning was performed via yeast homologous recombination46 using regions of homology added to DNA during PCR (see primer list, Supporting Table S2). PCR of DNA to be cloned was performed with Phusion polymerase. When appropriate, E. coli was cultivated in LB medium at 37°C with shaking at 200 rpm, supplemented with 100 µg/mL ampicillin as necessary. A series of vectors designated as pBOT (Supporting Figure S1) were designed to facilitate gene expression and enzyme activity assays. Each pBOT vector has a unique combination of yeast selection marker, promoter, and terminator (Supporting Table S3). Details on the construction of the pBOT vectors are outlined in Supporting Materials and Methods. To switch selectable markers of Ring B-closing CYP719s, the promoter-gene-terminator cassette from pBOT-Trp was introduced into pBOT-Leu by digesting both vectors with NotI/AscI, gel purifying the pBOT-Trp insert and the pBOT-Leu vector, and ligating both fragments.

Dihydrosanguinarine pathway genes other than those purchased in this study were cloned into either pGREG or pYES vectors (Supporting Table S3), where they could be used for activity assays and/or genomic integration. Promoters, genes, and terminators introduced into pGREG or pYES vectors were amplified with overlapping homology regions (Supporting Table S2) and cloned by homologous recombination. Heterologous DNA linkers (C1, C6, H1, and H2) were added as previously described21 (indicated in bold in Supporting Tables S2, S3 and S4).

Chromosomal integration of genes and multi-gene pathways

To facilitate activity assays and to build a dihydrosanguinarine synthesis strain, some genes and multi-gene pathways were integrated into the genome of S. cerevisiae at sites previously determined to allow high levels of gene expression70 (sites and genes are indicated in Supporting Table S4). Strains for activity assays (GCY1333, GCY1270, and GCY1317) were built using homologous recombination and selected for resistance to 200 µg/L geneticin and/or hygromycin using the antibiotic markers kanMX and hphNT1, respectively.71-72 The dihydrosanguinarine synthesis strain (strain GCY1440) was built using homologous recombination and CRISPR-Cas9 (Supporting Table S4). Regions of DNA (~500 bp) upstream and downstream of the integration site (UP and DOWN regions, respectively) were amplified with homology to heterologous DNA to guide gene integration. Dihydrosanguinarine synthesis genes used to build strains GCY1333, GCY1270 and GCY1317 were excised from pGREG vectors using AscI/NotI and gel purified, while dihydrosanguinarine synthesis genes used to build strain GCY1440 were amplified from plasmids and heterologous linkers were added (LV3, LTP1, LTP2, LV5). When applicable, Cas9 was directed to the 5′ and 3′ ends of integration sites using two guide RNAs (gRNAs). The 20-bp targeting regions of gRNAs were introduced through splice overlap extension using the primers indicated in Supporting Table S2. Both gRNAs, along with linearized vector containing Cas9 (pCAS-Tyr),73 were co-transformed into S. cerevisiae. Successful gene integration by both methods was confirmed by PCR. All primers used are listed in Supporting Table S2.
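The selection of UP and DOWN regions around an integration site can be sketched in a few lines. This is an illustrative helper only, assuming a chromosome sequence held as a string and 0-based, half-open site coordinates; the function name, the coordinates, and the toy sequence are hypothetical and do not correspond to the sites in Supporting Table S4.

```python
def homology_arms(
    chromosome: str, site_start: int, site_end: int, arm_len: int = 500
) -> tuple[str, str]:
    """Return the (UP, DOWN) regions flanking an integration site.

    The site spans chromosome[site_start:site_end]; UP is the arm_len bases
    immediately upstream and DOWN is the arm_len bases immediately downstream.
    """
    if site_start < arm_len or site_end + arm_len > len(chromosome):
        raise ValueError("integration site is too close to the sequence end")
    up = chromosome[site_start - arm_len:site_start]
    down = chromosome[site_end:site_end + arm_len]
    return up, down


if __name__ == "__main__":
    # Toy sequence: 600 A's, a 4-bp placeholder site, then 600 C's.
    toy_chromosome = "A" * 600 + "GGGG" + "C" * 600
    up, down = homology_arms(toy_chromosome, 600, 604)
    print(len(up), len(down))
```

In practice the arms would be amplified by PCR with primers that also carry homology to the heterologous linkers, which this sketch does not model.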

BIA culture supplementation assays

Activity of CYP719s and NMTs on BIAs was analyzed using culture substrate supplementation assays. Yeast cultures were grown in yeast nitrogen base (YNB) with 2% glucose and amino acid dropout media as appropriate at 30°C and 200 rpm. Yeast cells harboring enzymes of interest were inoculated in triplicate into 100 µL of media in 96-well 2 mL deep-well plates and incubated overnight. The following day, 900 µL of fresh media was added (1:10 dilution) and cultures were incubated for an additional 6 hrs. Cells were pelleted by centrifugation for 5 min at 3,200 × g and supernatants were aspirated. Cell pellets were suspended in 300 µL TE (10 mM Tris, 1 mM EDTA, pH 8) containing BIAs as appropriate, at a concentration of 5 µM unless otherwise specified, and incubated overnight at 30°C with shaking. The following day, cells were pelleted at 3,200 × g. Supernatants were transferred to 96-well microtiter plates, diluted 1:1 in 100% methanol and clarified at 3,200 × g prior to analysis by LC-MS. To extract BIAs from cells, pellets were suspended in 300 µL methanol, vortexed at 1,000 rpm and 4°C for 30 min and centrifuged at 3,200 × g. The resulting extracts were analyzed by LC-MS as described below. To stay in the linear range of the LC-MS, samples supplemented with >10 µM of a BIA were diluted before analysis. (S)-Scoulerine and (S)-stylopine were purchased from ChromaDex (Irvine, CA, USA) and (R,S)-norlaudanosoline was purchased from Enamine Ltd. (Kiev, Ukraine). Dihydrosanguinarine was prepared from sanguinarine by NaBH4 reduction.74

Liquid chromatography-mass spectrometry

Using a Perkin Elmer SERIES 200 Micropump, 5 µL of sample was injected onto an Agilent Zorbax Rapid Resolution HT C18 column (2.1 × 30 mm, 1.8 µm) and analytes were separated by reverse-phase HPLC using the following gradient: Solvent A, 0.1% formic acid; Solvent B, 100% acetonitrile, 0.1% formic acid; 0-1 min 95% A, 1-8 min 5 to 100% B (linear gradient), 8-9 min 100% B; 9-9.1 min 95% A, followed by a 2 min equilibration at 95% A.
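The gradient program can be written as a small lookup function, which may be convenient when relating retention times to solvent composition. This is a sketch under stated assumptions: the 9-9.1 min step is treated as a linear ramp back to 95% A (the text gives only the endpoint), the total run time is taken as 11.1 min including equilibration, and the names `GRADIENT` and `percent_b` are illustrative.

```python
# Breakpoints of the gradient as (time in min, percent Solvent B).
# 95% A corresponds to 5% B.
GRADIENT = [
    (0.0, 5.0),    # start at 95% A
    (1.0, 5.0),    # hold 95% A for 1 min
    (8.0, 100.0),  # linear ramp from 5% to 100% B
    (9.0, 100.0),  # hold 100% B
    (9.1, 5.0),    # return to 95% A (assumed linear over 0.1 min)
    (11.1, 5.0),   # 2 min equilibration at 95% A
]


def percent_b(t_min: float) -> float:
    """Percent Solvent B at time t_min, by linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("time is outside the gradient program")


if __name__ == "__main__":
    for t in (0.5, 4.5, 8.5, 10.0):
        print(f"{t:4.1f} min: {percent_b(t):5.1f}% B")
```

Percent Solvent A at any time is simply 100 minus the returned value, since only two solvents are used.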

HPLC-grade methanol and acetonitrile were purchased from Fisher Scientific, and HPLC-grade water and formic acid were purchased from Fluka. Sample elution was followed by injection into the 7T-LTQ FT ICR mass spectrometer (Thermo Scientific) under the following conditions: resolution, 50,000 at m/z 400; scanning range, 150-500 AMU; source voltage, 4.9 kV; source temperature, 380°C; AGC target for full mass spectrum, 1 × 10⁶ ions. Retention time, exact mass (