Sequence alignment of cfALDH with the human ALDH1 indicates that the cyanobacterial protein contains the same GQCC motif as the human ALDH1 and ALDH2 proteins (Fig. 1d, highlighted in red, and Supplementary Fig. 2). This motif is present in ALDH1/2 orthologues, and its residues sit at the bottom of the substrate entry channel (SEC). The second cysteine carries out the nucleophilic attack on the carbonyl carbon of the aldehyde (Fig. 1c).
To ascertain the activity of the protein, cfALDH was cloned from genomic DNA and over-expressed in E. coli. SDS-PAGE analysis showed that the purified protein has a molecular weight of ∼50 kDa (Fig. 2a), and accurate mass spectrometry gave a mass of 53613.60 Da against a calculated mass of 53614.19 Da (Supplementary Fig. 3). The protein purified as a homo-tetramer on size-exclusion chromatography (calculated mass of ∼239 kDa, Fig. 2b and calibration in Supplementary Fig. 4).
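The masses quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in the text; the function name and the use of a parts-per-million error are illustrative choices, not part of the original analysis.

```python
# Values reported in the text
observed_mass_da = 53613.60    # accurate mass spectrometry
calculated_mass_da = 53614.19  # calculated mass of the monomer
sec_mass_kda = 239.0           # mass reported from size-exclusion chromatography


def ppm_error(observed, expected):
    """Relative mass error in parts per million."""
    return (observed - expected) / expected * 1e6


mass_error_ppm = ppm_error(observed_mass_da, calculated_mass_da)

# Mass expected for four copies of the calculated monomer
tetramer_kda = 4 * calculated_mass_da / 1000.0

# Number of monomers implied by the size-exclusion mass
apparent_subunits = sec_mass_kda * 1000.0 / calculated_mass_da

print(f"Mass error: {mass_error_ppm:.1f} ppm")
print(f"Expected tetramer mass: {tetramer_kda:.1f} kDa")
print(f"Apparent subunits from SEC: {apparent_subunits:.2f}")
```

The size-exclusion mass corresponds to roughly 4.5 monomers, which rounds to the tetramer assignment given that size-exclusion estimates depend on molecular shape as well as mass.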
To establish whether retinoic acid is produced by this putative retinal dehydrogenase, an LC-MS-based method was used. The purified protein was incubated with NAD+ and all-trans-retinal to test for conversion to retinoic acid. Samples were extracted into hexane and analysed by LC-MS. After incubation at 37 °C for 90 min, retinoic acid was clearly produced when all components were present, but not when the cofactor NAD+ was omitted (Fig. 2c and d, standards shown in Supplementary Fig. 5).
A fluorescence-based assay was then used to measure NADH production and thereby calculate the level of retinoic acid production. The concentration of all-trans-retinal was varied between 0.1 and 10 μM while NAD+ was held at a high concentration of 1 mM and cfALDH was kept low at 200 nM (Fig. 3a). When the concentration of cfALDH was varied, activity varied with enzyme concentration, as expected (Fig. 3b). Activity was optimal at a basic pH of 8.5–9.5 (Fig. 3c) and was enhanced by the presence of Mg2+ in the assay buffer (Fig. 3d), as seen with other ALDH1 enzymes (Supplementary Fig. 6). This is the first time a cyanobacterial ALDH has been seen to produce retinoic acid from all-trans-retinal.
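Because an NAD+-dependent aldehyde dehydrogenase produces one NADH for each aldehyde oxidised to its acid, NADH fluorescence can be converted into retinoic acid concentration through an NADH standard curve. The following is a minimal sketch of that conversion. The standard-curve readings, the blank-subtraction step and all function names are hypothetical and are not taken from the assay described here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical NADH standards (concentration in uM, fluorescence in arbitrary units)
standard_nadh_um = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
standard_fluorescence = [50.0, 250.0, 450.0, 650.0, 850.0, 1050.0]

slope, intercept = fit_line(standard_nadh_um, standard_fluorescence)


def nadh_um(fluorescence):
    """Convert a fluorescence reading to NADH concentration via the standard curve."""
    return (fluorescence - intercept) / slope


def retinoic_acid_um(sample_fluorescence, blank_fluorescence):
    """Retinoic acid formed, assuming 1:1 stoichiometry with NADH.

    The blank is a hypothetical no-enzyme control read at the same time point.
    """
    return nadh_um(sample_fluorescence) - nadh_um(blank_fluorescence)


example_product_um = retinoic_acid_um(sample_fluorescence=550.0, blank_fluorescence=50.0)
print(f"Retinoic acid formed: {example_product_um:.2f} uM")
```

Subtracting a no-enzyme blank is one way to remove background fluorescence; a real assay may also need a correction for substrate or buffer fluorescence.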
Previous work identified signature residues found in either ALDH1 or ALDH2 that potentially enable prediction of whether an ALDH can process large or small aldehydes. When we compare the cfALDH sequence to these signatures, we find that the protein contains 8/34 signature residues of the sheep ALDH1A1 (NP_001009778.1), with which it shares 64% sequence identity, 17/34 of human ALDH2 (AAP36614.1), with which it shares 68% sequence identity, and 9/34 that match neither ALDH1A1 nor ALDH2 (Fig. 4a and Supplementary Fig. 6). These residues were then mapped onto the predicted model of cfALDH, generated using Phyre2 v.2.0, where they are distributed across the surface and core of the protein (Fig. 4c).
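The signature comparison amounts to classifying each signature position by which reference the query residue matches. The sketch below illustrates that tally on toy data. The five-residue strings are hypothetical placeholders, not the real signature residues, and the function name is an assumption made for illustration.

```python
from collections import Counter


def classify_signature_positions(query, aldh1_ref, aldh2_ref):
    """Count, per signature position, which reference the query residue matches."""
    if not (len(query) == len(aldh1_ref) == len(aldh2_ref)):
        raise ValueError("all three residue strings must have the same length")
    counts = Counter()
    for q, a1, a2 in zip(query, aldh1_ref, aldh2_ref):
        if q == a1 and q == a2:
            counts["both"] += 1
        elif q == a1:
            counts["ALDH1-like"] += 1
        elif q == a2:
            counts["ALDH2-like"] += 1
        else:
            counts["neither"] += 1
    return counts


# Hypothetical residues at five aligned signature positions
toy_counts = classify_signature_positions(
    query="GAVLK", aldh1_ref="GTVMR", aldh2_ref="SAILR"
)
print(dict(toy_counts))

# Tallies reported in the text account for all 34 signature positions
reported = {"ALDH1-like": 8, "ALDH2-like": 17, "neither": 9}
reported_total = sum(reported.values())
print(f"Reported positions accounted for: {reported_total}/34")
```

The "both" category is included only for completeness; signature positions are chosen because the two reference families differ there, so it would normally stay empty.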
We then estimated the size of the substrate entry channel (SEC) using CASTp 3.0. This enabled us to assess whether the cyanobacterial ALDH was likely to be more characteristic of an ALDH1 or an ALDH2, given that it appears to be equally related to these proteins in our phylogenetic analysis (Fig. 1b). The size of the SEC in an ALDH has previously been suggested as a way of predicting substrate preference [24,33].
The SEC in cfALDH is predicted to have a volume of 635.05 Å³ when the sheep ALDH1A1 structure 1BXS is used as the template, between the volumes seen in the sheep ALDH1A1 (670.2 Å³) and the human ALDH2 (558.07 Å³) (Fig. 4b). The GQCC motif is at the bottom of the SEC in cfALDH, as expected. This suggests that the cyanobacterial (and planctomyces) ALDH could resemble the evolutionary branching point that preceded the animal ALDH1 and ALDH2 enzymes, which are selective for large and small aldehydes, respectively.
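To make "between" concrete, the three reported volumes can be placed on a common scale running from the human ALDH2 value to the sheep ALDH1A1 value. This is a simple illustrative calculation on the numbers in the text, not an analysis performed in the study, and the function name is hypothetical.

```python
# SEC volumes in cubic angstroms, as reported in the text
sheep_aldh1a1_volume = 670.2
human_aldh2_volume = 558.07
cf_aldh_volume = 635.05


def relative_position(value, low, high):
    """Return 0.0 when value equals low and 1.0 when value equals high."""
    return (value - low) / (high - low)


position = relative_position(cf_aldh_volume, human_aldh2_volume, sheep_aldh1a1_volume)
print(f"cfALDH lies {position:.2f} of the way from ALDH2 to ALDH1A1")
```

On this scale the cfALDH channel lies about two thirds of the way towards the ALDH1A1 volume, although the value comes from a predicted model built on the ALDH1A1 template and should be read with that in mind.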
Discussion
We have identified and characterised for the first time a cyanobacterial ALDH that can convert all-trans-retinal to retinoic acid. Blooms producing retinoic acid are proving toxic to local flora and fauna, so any further insight into the mechanism of this production is crucial. This is a potentially important step in elucidation of the biosynthesis of retinoic acid in cyanobacteria, though further experimentation is required to determine if this enzyme is in fact employed for this purpose in this cyanobacterium.