A bioinformatic and computational study of myosin phosphatase subunit diversity




Am J Physiol Regul Integr Comp Physiol 307: R256–R270, 2014. First published June 4, 2014; doi:0.5/ajpregu.0045.04.

Rachael P. Dippold and Steven A. Fisher
Department of Medicine, Cardiology, University of Maryland Baltimore, Baltimore, Maryland
Submitted 0 April 2014; accepted in final form 5 May 2014

Variability in myosin phosphatase (MP) subunits may provide specificity in signaling pathways that regulate muscle tone. We utilized public databases and computational algorithms to investigate the phylogenetic diversity of MP regulatory (PPP1R12A-C) and inhibitory (PPP1R14A-D) subunits. The comparison of exonic coding sequences and expression data confirmed or refuted the existence of isoforms and their tissue-specific expression in different model organisms. The comparison of intronic and exonic sequences identified potential expressional regulatory elements. As examples, the smooth muscle MP regulatory subunit (PPP1R12A) is highly conserved through evolution. Its alternative exon E24 is present in fish through mammals with two invariant features: 1) a reading frame shift generating a premature termination codon and 2) a hexanucleotide sequence adjacent to the 3′ splice site hypothesized to be a novel suppressor of exon splicing. A characteristic of the striated muscle MP regulatory subunit (PPP1R12B) locus is numerous and phylogenetically variable transcriptional start sites. In fish this locus only codes for the small (M21) subunit, suggesting the primordial function of this gene. Inhibitory subunits show little intragenic variability; their diversity is thought to have arisen by expansion and tissue-specific expression of different gene family members.
We demonstrate differences in the regulatory landscape between smooth muscle enriched (PPP1R14A) and more ubiquitously expressed (PPP1R14B) family members and identify deeply conserved intronic sequence and predicted transcriptional cis-regulatory elements. This bioinformatic and computational study has uncovered a number of attributes of MP subunits that support selection of ideal model organisms and testing of hypotheses regarding their physiological significance and regulated expression.

myosin phosphatase; Mypt1; Mypt2; CPI-17

Address for reprint requests and other correspondence: S. A. Fisher, S-0 HSFII, 0 Penn St., Baltimore, MD 0 (e-mail: Sfisher@medicine.umaryland.edu).

THE SEQUENCING of the human genome (44, 04) opened vast new insights into the variability and diversity encoded within the genome. Before this, predictions for the number of human protein-coding genes ranged between 35,000 (4) and upwards of 50,000 (4). The count of protein-coding genes in the most recent build of the human genome (GRCh37.p13, 2009) is 20,805, a number that is on par with that of the simple organism Caenorhabditis elegans, 0,53 (WBcel235, 2012). This has led to the premise that much of the complexity of the human transcriptome is derived from the creation of multiple products from a single gene, such that humans have 69,000 different proteins, whereas in comparison the worm has 5,000, according to current database curations (Ensembl, UniProt). Multiple transcripts are generated from a single gene by alternative usage of exons through alternative splicing or alternative transcriptional start sites, each of which is a nearly universal feature of multiexon genes in mammals and higher vertebrates (67, 79, 08). An example is the human tropomyosin (TPM) gene, in which the combination of alternative splicing and multiple transcriptional start sites results in 8 distinct protein coding transcripts (reviewed in Ref. 3).
In humans, the DNA of protein-coding genes accounts for less than 3% of the total genome (0); in the past the remaining 97% of the genome was considered junk DNA. However, recent evidence suggests that 80% of the genome is biochemically active: either transcribed (protein coding, noncoding, and pseudogenes), chromatin-associated, or regulatory (0). These active regulatory regions may be intra- or intergenic and contain cis elements that regulate gene transcription or removal of introns (splicing of exons) to convert pre-mRNA into mature protein coding mRNA. Phylogenetic conservation of noncoding DNA sequences has been successfully used to identify transcriptional regulatory elements such as enhancers (reviewed in Ref. 37), though there is not a simple relationship in either direction between sequence conservation and regulatory elements (reviewed in Refs. 7 and 3). Similarly, it has been found that intronic regions flanking alternative exons, and the cis regulatory elements within, are under selection pressure and often exhibit broad conservation (67). Tissue specificity and tissue gene expression signatures are also frequently conserved, particularly among mammals and higher vertebrates, such that the transcriptome of a given tissue is more similar across species than it is to other cell types within the same species (3, 6, 67).

The evolutionary divergence of multiple closely related family members from a single ancestral gene creates diversity in higher organisms through specific expression, either tissue specific or during development, and also through specificity of interactions and substrates. The differing strategies of cell conservation and diversification are illustrated in the protein kinases and phosphatases that regulate approximately one-third of all eukaryotic proteins. Approximately 400 serine-threonine kinases are present in the mammalian genome with a steady increase in the number throughout evolution of eukaryotes.
In contrast, only 5 corresponding phosphatases are present with little evolutionary change in this number (87). This reflects differing strategies of diversification, as diversity in the activity of Type 1 phosphatases is generated by the large number of, and variability within, associated regulatory subunits and signaling pathways controlling their activity (9). In smooth muscle the myosin light chain kinase (MLCK) and phosphatase (MP) are primary determinants of the state of contraction of the muscle and thereby contribute to the regulation of blood flow and pressure and other vital organ functions (reviewed in Refs. 34, 38, and 40). MLCK and MP enzymes are also present in striated muscle, where their function is less understood but is thought to be more modulatory of contractile function (reviewed in Refs. 47 and 94), and in nonmuscle cells, where they regulate motility and cytokinesis (reviewed in Ref. 65). The MP holoenzyme is composed of three subunits: the protein phosphatase 1 (PP1) catalytic subunit, the MP targeting/regulatory subunit (MYPT1), and the small subunit (M21) (reviewed in Ref. 45). A fourth subunit, CPI-17, is an inhibitory subunit that when phosphorylated inhibits MP activity (reviewed in Ref. ).

Using traditional methods, we and others have made some progress in determining how variability in the MP regulatory subunits in humans and animal models may provide cell-specific functions. For example, smooth muscle, like striated muscle, may be functionally dichotomized into fast (phasic) versus slow (tonic) contractile phenotypes. Each uses MLCK and MP for activation and deactivation of force, yet the force outputs and how they are regulated are very different (reviewed in Refs. 7 and 00). Within the Mypt1 regulatory subunit an alternative exon (exon 24) that is included in phasic smooth muscle and skipped in tonic smooth muscle is thought to determine regulation of MP activity by cGK [nitric oxide (NO) signaling pathway] (5, 83, 0; reviewed in Ref. 5). Similarly, variable expression of an inhibitory subunit of MP, CPI-17 (PPP1R14A), is proposed to determine tissue-specific regulation of MP activity by α-adrenergic signaling (53, 7; reviewed in Ref. ).

We hypothesized that through interrogation of publicly available databases we could 1) uncover much more about the phylogenetic conservation, variability, and tissue-specific expression of MP subunit isoforms; and 2) discover potential transcriptional and splicing cis-regulatory elements that may control the variability in the expression of MP subunits in muscle tissues, regarding which little is currently understood (reviewed in Refs. 3, 46, 78, and 07).

0363-6119/14 Copyright 2014 the American Physiological Society http://www.ajpregu.org
This informatics approach provides new insights into the relationship between MP diversity and muscle diversity and predictions regarding how MP diversity may be generated, providing a foundation for further experimentation and selection of appropriate model organisms.

METHODS

Protein Domain

We used several protein databases to identify conserved protein domains in fly and worm: InterPro (43), PANTHER (0), Pfam (89), and SUPFAM (80).

Conservation and Alignment

Conservation of the Mypt1 alternative exon was determined with the PhastCons conservation track in the UCSC Genome Table Browser (genome.ucsc.edu) (49, 50). All genomic and transcript data were gathered from Ensembl (Release 72, June 2013). Nucleic acid sequence alignment was performed using Clustal Omega (http://www.ebi.ac.uk/tools/msa/clustalo/) (98). JalView () was used to calculate pairwise sequence identity and to perform alignment editing.

Splice Site and cis-regulatory Element Predictions

Mypt1 E24 splice sites were predicted using Human Splicing Finder v.2.4.1 (www.umd.be/hsf/) (3) and the Alternative Splice Site Predictor (ASSP; wangcomputing.com/assp) (09), using human and chicken Mypt1 E24 as prediction positive controls. Human Splicing Finder uses a positional weight matrix (PWM) of consensus splice signals (3) to identify potential splice sites that score above threshold against those signals. The ASSP prediction algorithm was developed using sequences for known constitutive, alternative, and cryptic splice sites to take into account nonconsensus splice sites (09). Conserved motifs were identified using Gapped Local Alignment of Motifs (GLAM2; meme.nbcr.net) (8). Predicted cis-splicing regulatory elements (SREs) were then identified using Human Splicing Finder v.2.4.1 (www.umd.be/hsf/) (3).
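The PWM-based identification of candidate splice sites described above can be illustrated with a minimal sketch. The count matrix, the 4-nt motif width, the uniform background, the pseudocount, and the threshold are all invented for illustration; this is not the matrix or the scoring scheme of Human Splicing Finder, only the general log-odds technique.

```python
import math

# Toy nucleotide counts for a 4-nt motif (10 hypothetical training sites).
# These numbers are invented for illustration; they are NOT taken from
# Human Splicing Finder or any published splice-site model.
COUNTS = {
    "A": [8, 1, 0, 1],
    "C": [1, 0, 0, 1],
    "G": [0, 9, 0, 7],
    "T": [1, 0, 10, 1],
}
BACKGROUND = 0.25   # assumption: uniform base composition
PSEUDOCOUNT = 1.0   # avoids log(0) for bases never observed at a position


def build_pwm(counts):
    """Convert per-position counts into log2-odds scores versus background."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * PSEUDOCOUNT
        pwm.append({
            b: math.log2(((counts[b][i] + PSEUDOCOUNT) / total) / BACKGROUND)
            for b in "ACGT"
        })
    return pwm


def score(pwm, kmer):
    """Sum the log-odds of each base at its position."""
    return sum(column[base] for column, base in zip(pwm, kmer))


def scan(pwm, sequence, threshold):
    """Return (start, k-mer, score) for every window scoring >= threshold."""
    width = len(pwm)
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - width + 1):
        kmer = sequence[start:start + width]
        if set(kmer) <= set("ACGT"):          # skip windows with N or gaps
            s = score(pwm, kmer)
            if s >= threshold:
                hits.append((start, kmer, round(s, 2)))
    return hits


PWM = build_pwm(COUNTS)
HITS = scan(PWM, "ccAGTGcc", threshold=4.0)
print(HITS)
```

In practice the threshold is what separates "conforms to consensus" from background; a lower threshold admits weak, nonconsensus sites of the kind that ASSP was designed to capture, at the cost of more false positives.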
Human Splicing Finder reports predicted SREs from several experimental and computational prediction sets: ESE Finder (7, 99), RESCUE ESE (5), and other published predictions (33, 0, 9, ). Of the experimentally derived SRE prediction sets, exonic splicing silencer (ESS) sequences were obtained by testing random decanucleotide sequences in splicing reporters (0). SREs from ESE Finder and Human Splicing Finder were predicted using consensus motif PWMs derived from SELEX (systematic evolution of ligands by exponential enrichment) experiments (7, 3). The other SRE prediction sets were computationally derived using different criteria: predicted exonic splicing enhancers (ESEs) based on hexamer enrichment in weak exons [RESCUE ESE (5)]; predicted exonic splicing regulatory sequences (ESRs) based on enrichment of conserved (human-mouse) hexamers in human-mouse orthologous exons (33); predicted exonic SREs based on enrichment of octamers in constitutive, noncoding exons (); and predicted SREs asymmetrically enriched in introns (intronic identity elements: IIEs) or exons (exonic identity elements: EIEs) (9).

Expression Data

Affymetrix exon array data (88) and RNA-Seq tissue expression data (85) were obtained from publicly available sources (genome.ucsc.edu; http://www.ebi.ac.uk/gxa).

Promoter and Enhancer Analysis

Transcriptional start site predictions for human genes were assembled from the SwitchGear Genomics Transcriptional Start Site track in the UCSC Genome Table Browser (genome.ucsc.edu) (49, 50). SwitchGear TSS predictions are based on human GenBank cDNAs. UCSC Genome Browser (human, hg19) ENCODE tracks for H3K4Me1, H3K4Me3, and H3K27Ac from 7 human cell lines (9), ENCODE DNase hypersensitivity clusters from 125 human cell types (0, 03), and ENCODE transcription factor ChIP were used to determine regions of transcriptional activity. Regions conserved between human and mouse were aligned using the ECR Browser (www.dcode.org) (77).
Conserved and aligned transcription factor binding sites (TFBS) were identified using the NCBI DCODE package including MultiTF and rVista 2.0 (www.dcode.org) (6, 76). The rVista program utilizes the comprehensive database of TFBS motifs TRANSFAC Pro V10.2 (4) and scores sequence similarity of TFBS to TRANSFAC PWMs. We used two cutoffs for TFBS identification: 1) the rVista "optimized for function" method, which independently optimizes each TFBS to limit the density of an individual TFBS to 3 sites per 10 kb of random sequence (76), and 2) a 0.85 fixed cutoff score for sequence similarity to TRANSFAC PWMs.

RESULTS

Phylogenetic Conservation of MP Subunit Families

Catalytic subunit. The PP1 family (PPP1CA, PPP1CB, PPP1CC) is highly conserved with extensive sequence identity between paralogs ( 85%) and among orthologs in other species (Fig. 1A; reviewed in Refs. 9 and 0) (70). Orthologs of this family can be found in the fruit fly, worm, yeast, and plants (58), indicating ancient origins for the PP1 catalytic subunit.
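Pairwise sequence identity of the kind reported here for paralogs and orthologs can be sketched as follows. This is a minimal illustration under stated assumptions: the sequences are invented, the input is assumed to be already aligned (equal length, `-` for gaps), and identity is computed over columns where neither sequence has a gap. Alignment tools differ in which denominator they use, so values from JalView or other software may not match this definition exactly.

```python
from itertools import combinations

GAP = "-"


def percent_identity(a, b):
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == GAP or y == GAP:
            continue                     # gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_table(aligned):
    """Pairwise identity (rounded to 0.1%) for every pair in the alignment."""
    return {
        (m, n): round(percent_identity(aligned[m], aligned[n]), 1)
        for m, n in combinations(aligned, 2)
    }


# Hypothetical aligned sequences, for illustration only.
ALIGNED = {
    "seqA": "ACGT-ACGTA",
    "seqB": "ACCTGACGTA",
    "seqC": "AC-TGACTTA",
}
TABLE = identity_table(ALIGNED)
for pair, value in TABLE.items():
    print(pair, value)
```

Excluding gapped columns keeps short insertions from depressing the identity of otherwise conserved domains; counting them instead would be the stricter choice when terminal extensions are themselves of interest.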

The central majority of the PP1 proteins, including the catalytic and binding domains, have nearly identical amino acid sequences. The modest amount of sequence diversity among the paralogs is found in the distal COOH- and NH2-terminal ends of the proteins. Pairwise analysis of the amino acid and nucleotide sequences indicates that the PPP1CA and PPP1CC paralogs are more similar to each other than to PPP1CB, the gene that codes for the catalytic subunit of MP (also known as PP1β or PP1δ). This is true for the fly as well, suggesting that PPP1CB originated before vertebrates and invertebrates diverged (58). PP1γ (PPP1CC) appears to be a newer isoform, having diverged from PPP1CA, and is less conserved among orthologs than are PP1α (PPP1CA) and PP1β (PPP1CB). All of the paralogs are ubiquitously expressed. Thus diversity in the serine-threonine type 1 phosphatases depends on a multitude of PP1 regulatory subunits (reviewed in Ref. 0). The regulatory subunits of the MP catalytic subunit, Mypt and CPI-17, have greater evolutionary divergence within their families and greater variability among their orthologs than the catalytic subunit (Fig. 1).

Regulatory subunit. The Mypt (PPP1R12) regulatory subunit family is part of a very large superfamily of ankyrin repeat domain proteins composed of two subtrees. Along with the PPP1R12 family, the first subtree contains the PPP1R16 (A-B) family, also known as Mypt3 (PPP1R16A) and TGF-β-inhibited membrane-associated protein (TIMAP; PPP1R16B). The second subtree within this superfamily contains PPP1R13B, which codes for apoptosis-stimulating protein of p53 (ASPP1), PPP1R13L, which codes for inhibitor of apoptosis stimulating protein of p53 (iASPP), and the tumor protein p53 binding protein 2 (TP53BP2). These subfamily members are sometimes classified as Mypts; given their evolutionary distance and lack of the cardinal features of Mypt family members (described below), we consider them to be distinct subfamilies. The myosin phosphatase regulatory subunit family of proteins is composed of Mypt1, Mypt2, and MBS85 (PPP1R12A-C, respectively). The PPP1R12 family is well conserved and there is an orthologous Mypt gene in both the fly (Mbs, Drosophila melanogaster) and worm (mel-11, C. elegans) (68, 6). Each of the PPP1R12 family members contains an RVxF motif (PP1 binding site) and 7 or 8 ankyrin repeat domains toward the NH2-terminus, conserved Thr phosphorylation sites, and a leucine zipper motif at the COOH-terminus (reviewed in Ref. 34).
Whereas Mypt family members (paralogs) overall have 39 6% amino acid identity, the described conserved domains have much higher sequence identities (50–90%) (reviewed in Ref. 45). These characteristic domains of the Mypt family hold true for the fly and worm orthologs as well (68, 6). The COOH-terminal leucine zipper (LZ) motif is highly conserved among species and is similar in sequence between family members (88%) (34, 45). BLAST searches of the genome with any of the three PPP1R12 family LZ motifs match only the other members of the family, suggesting that these LZ motif sequences are specific to the PPP1R12 Mypt family.

Inhibitory subunit. Sequence similarity between inhibitory subunit family member paralogs and orthologs is much less than for the catalytic subunit family (Fig. 1C). However, within the phosphatase inhibitory domain there is 4% sequence similarity (reviewed in Ref. ). Importantly, the Thr38 phosphorylation site of PPP1R14A (CPI-17), which regulates its inhibitory activity, and the surrounding residues, which are necessary for phosphorylation of Thr38, are conserved in the other three family members (PPP1R14B-D). Orthologs of PPP1R14A are found in lower species, CG74 in the fly and F55C0.5 in the worm (64, 95) (flybase.org; wormbase.org), and were not previously identified (). The Thr38 phosphorylation site is conserved in the fly and worm orthologs, while the flanking sequences contain hydrophobic and basic residues but vary from the vertebrate consensus sequence (basic-hydrophobic-Thr-hydrophobic-basic). Interestingly, the chicken genome lacks the PPP1R14A gene, and it has been suggested that chicken smooth muscle MP is inhibited by the CPI-17 paralog PHI (PPP1R14B) (7, 8, 54). However, PPP1R14B is also not found in the completed sequence of the chicken genome (Genome assembly Galgal4, Ensembl Genebuild updated Dec. 2013).
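The comparison of flanking residues against the vertebrate consensus (basic-hydrophobic-Thr-hydrophobic-basic) can be sketched as a simple residue-class check. The residue groupings below (K/R/H as basic; A/V/L/I/M/F/W/Y/C as hydrophobic) are one common convention chosen here as an assumption, and the three peptides are invented for illustration; none is taken from a database entry for CPI-17 or its orthologs.

```python
# Assumed residue classes; other groupings are in use and would change results.
BASIC = set("KRH")
HYDROPHOBIC = set("AVLIMFWYC")

# Vertebrate consensus around the phospho-Thr, written as residue classes:
# b = basic, h = hydrophobic, T = threonine.
CONSENSUS = "bhThb"


def residue_class(aa):
    """Map one amino acid letter to b, h, T, or o (other)."""
    aa = aa.upper()
    if aa in BASIC:
        return "b"
    if aa in HYDROPHOBIC:
        return "h"
    if aa == "T":
        return "T"
    return "o"


def flank_pattern(peptide, thr_index, flank=2):
    """Residue classes for the window centered on the candidate Thr."""
    start = thr_index - flank
    end = thr_index + flank + 1
    if start < 0 or end > len(peptide):
        raise ValueError("window extends beyond the peptide")
    return "".join(residue_class(aa) for aa in peptide[start:end])


def matches_consensus(peptide, thr_index):
    return flank_pattern(peptide, thr_index) == CONSENSUS


# Hypothetical peptides with the candidate Thr at index 4.
EXAMPLES = {
    "consensus_like": "GSKLTIRDE",   # K-L-T-I-R
    "acidic_plus1": "GSKLTDRDE",     # K-L-T-D-R
    "nonbasic_plus2": "GSRATVQDE",   # R-A-T-V-Q
}
for name, peptide in EXAMPLES.items():
    print(name, flank_pattern(peptide, 4), matches_consensus(peptide, 4))
```

A site such as the third example keeps the threonine along with nearby basic and hydrophobic residues yet fails the strict pattern, which is the kind of partial agreement described for the fly and worm orthologs.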
Another member of the PPP1R14 family, e.g., the more closely related kinase enhanced phosphatase inhibitor (KEPI) (PPP1R14C), may function as the MP inhibitory subunit in chicken smooth muscle.

Fig. 1. Phylogenetic trees of myosin phosphatase subunit families. A: phylogenetic tree of the protein phosphatase 1 (PP1) catalytic subunit family demonstrates high conservation amongst orthologs and paralogs. The node containing the human myosin phosphatase (MP) catalytic subunit, PPP1CB (PP1β), is shown in red, whereas paralogs PPP1CA (PP1α) and PPP1CC (PP1γ) are in blue. B: phylogenetic tree of the MP targeting subunit family is given with the node containing the human smooth muscle MP subunit, protein phosphatase 1 regulatory subunit 12A (PPP1R12A) (MP targeting subunit 1; Mypt1), shown in red and paralogs PPP1R12B (Mypt2) and PPP1R12C (myosin binding subunit 85; MBS85) shown in blue. C: phylogenetic tree of the MP inhibitory subunit family (PP1 regulatory subunit 14; PPP1R14) demonstrates greater variability between paralogs and orthologs compared with the catalytic subunit in A. The node containing the human MP inhibitory subunit PPP1R14A (c-kinase potentiated protein phosphatase inhibitor 17; CPI-17) is shown in red, whereas the paralogs PPP1R14B (phosphatase holoenzyme inhibitor; PHI), PPP1R14C (kinase enhanced phosphatase inhibitor; KEPI), and PPP1R14D (gastric brain phosphatase inhibitor; GBPI) are shown in blue. In the trees the red boxes represent gene duplication and the dark blue boxes speciation events. Light blue boxes are ambiguous nodes. Branch lengths along the horizontal axis correspond to the expected number of changes per nucleotide site in the DNA, as indicated in the corresponding scale bars, which is proportional to evolutionary divergence. The length (horizontal) of solid branches and nodes (triangles) is distance, whereas dashed branches and striped nodes are 0 distance. The width (vertical) of nodes is proportional to the number of orthologs included in the node. Phylogenetic trees were developed by Ensembl, Release 75 (February 2014) (05).

Alternative Splicing of Regulatory Subunits and MP Diversity

Mypt1. The mammalian PPP1R12A (Mypt1) gene consists of 26 exons, 3 of which are alternatively spliced (E13, E14, and E24). Alternative splicing in the central region of Mypt1 is conserved, though the specific alternative exons vary (are not orthologous): E in the chicken (6, 96), E9 and E in the worm (5) (wormbase.org), and the functional significance of these variants is unknown. Alternative splicing of Mypt1 E24, a 31 nt exon, determines the coding for, or lack of, a COOH-terminal LZ motif in the Mypt1 protein (Fig. 2D). The LZ motif of Mypt1 is thought to be required for LZ-mediated heterodimerization with cGK and cGMP-dependent activation of MP (4, 5, 0) (57). The expression of Mypt1 E24 and central splice variant isoforms has been studied extensively during development (5, 83, 97) and in disease models (6, 9, 35, 6, 84, 97, 0) (reviewed in Refs. 5 and 7).

Mypt1 E24 is located in a region of high conservation spanning approximately 0 bp upstream through 00 bp downstream of E24 (Fig. 2A) (97). E24 is partially conserved in fish genomes (tetraodon, fugu, stickleback, medaka, and zebrafish), though, notably, a homologous sequence is absent in the frog, lamprey, and C. elegans orthologs. Sequence alignment of the alternative exon and 50 bp flanking regions demonstrates that this region is highly conserved in mammals, the chicken, and lizard; in fish there is 67–73% sequence identity of E24 to mammals, whereas the flanking sequence is less well conserved than in the higher vertebrates (Fig. 2B). The 3′ and 5′ splice sites are computationally predicted in the fish flanking the sequence that is homologous to the mammalian Mypt1 E24 alternative exon, resulting in a 34 (most fish) or 37 nt (zebrafish) predicted exon (Fig. 2B). Using RT-PCR, we confirmed that this is indeed a tissue-specific alternative exon in the zebrafish smooth muscle (data not shown). As in higher vertebrates, the fish alternative exon also causes a 1-nt shift in the reading frame, resulting in a Mypt1 subunit that lacks the COOH-terminal LZ motif (Fig. 2D, right). The LZ motif is highly conserved in the annotated PPP1R12A protein product in tetraodon, fugu, and zebrafish (Fig. 2D, left). The shift in the reading frame also results in a premature termination codon (PTC) in the fish Mypt1, a feature that is also conserved throughout phylogeny (Fig. 2D, right), suggesting functional importance. However, the function and significance of the Mypt1 COOH-terminus alternative (LZ−) amino acid sequence and PTC, respectively, are not known at this time.

The intronic sequence immediately flanking the alternative exon is well conserved (Fig. 2B). The nonconsensus 5′ splice site sequence (guagua) and the lack of an upstream polypyrimidine tract are both characteristic of a weakly spliced (alternative) exon (reviewed in Ref. ). Two upstream and one downstream blocks of intronic sequence are highly conserved from fish through higher vertebrates, suggesting they could function in the regulation of exon splicing (Fig. 2B). Analysis of these sequences using splicing regulatory element (SRE) prediction algorithms for known cis-splicing regulatory elements (see METHODS) reveals conserved sites for splicing factors SRSF2 (SRp30b) and SRSF6 (SRp55) in the intronic regions immediately flanking E24 (Fig. 2C). We also identified computationally predicted SREs of unknown function (5, 9) within the highly conserved sequence immediately adjacent to the 3′ splice site: ctgaaa (human-lizard)/ctgaag (fish) and tgaaag (human-lizard)/tgaagg (fish) (Fig. 2C). The proximity of the identified elements to the 3′ splice site suggests that they may function as splicing repressors by blocking recruitment of U2 splicing factor to this site (see DISCUSSION). Of note, the E24 sequence itself is highly conserved in higher vertebrates but less well conserved in fish. A number of cis-regulators of splicing located within higher vertebrate E24 (97) are not present within the fish E24 sequence. How this may affect the regulation of E24 splicing is considered further in DISCUSSION. A number of other predicted conserved cis-regulatory splicing elements are identified both within the alternative exon and flanking introns (Supplemental Table S).

Fig. 2. Phylogenetic conservation of isoforms of Mypt1 generated by alternative splicing of exon 24. A: conservation of the myosin targeting subunit 1 (Mypt1) alternative exon E24 and the flanking intronic region is shown by the phastCons track on the UCSC Genome Browser in the human hg19/GRCh37 genome release. The phastCons track is a multiple alignment of 46 vertebrate species ("Vertebrate Cons") and a subset of 33 placental mammals that estimates the probability of individual nucleotides belonging to a conserved element, by considering both the individual alignment column and its flanking columns. The higher the green bars, the more likely the region belongs to a conserved element. B: alignment and coloring based on percent identity of Mypt1 E24 in human, mouse, chicken, and lizard, and the predicted alternative exon in fish, demonstrates regions of high conservation immediately flanking the exon. Red lines attached to triangles highlight known and predicted (in fish) splice sites. Conserved sequences were identified and analyzed for splicing cis-regulatory elements as described in METHODS. C: exon-intron structure of the 3′ end of the Mypt1 gene is shown. Alternative splicing of the 31 nt E24 changes the reading frame. Amino acid sequence alignment of the Mypt1 COOH-terminus for the E24-out/leucine zipper-positive (LZ+) and the E24-in/LZ− isoforms demonstrates phylogenetic conservation, with LZ+ more conserved than LZ−. The leucines of the LZ motif for the E24-out isoform (left) are highlighted in gray. The amino acids that are coded by the alternative exon E24 are highlighted in blue.

Mypt2. Alternative splicing of exon 4 (numbering based on human gene) of the Mypt2 (PPP1R12B) gene product has similarities and differences with that of Mypt1 E24 but has been much less studied. The Mypt2 E4 skipped isoform codes for a highly conserved COOH-terminal LZ motif (Fig. 3C) that is nearly identical in amino acid sequence to the COOH-terminal LZ sequence of family members Mypt1 and p85. In contrast to Mypt1 E24, Mypt2 E4 inclusion codes for an alternative COOH-terminal LZ sequence and contains the PTC (Fig. 
3). Thus MBS85 (PPP1R12C) is the only Mypt family member with an invariant COOH-terminal (LZ+) sequence. Like Mypt1 E24, splicing of Mypt2 E24 is highly restricted (described in detail under Complexity of the Mypt2 Locus). Like Mypt1 E24, the Mypt2 E24 sequence is highly conserved in the genomes of mammals (94% sequence identity to mouse), chicken (76%), and lizard (74%) and is also absent in frog. There is 58-63% conservation of the coding portion of the human Mypt2 E24 in fish (medaka, stickleback, tetraodon) (Fig. 3B). Unlike Mypt1, there is a polypyrimidine-rich tract upstream of Mypt2 E24 and high 3' splice site consensus conformity (85 9%) in all of the species examined, giving a robust 3' splice site prediction conforming to the known mammalian splice site (Fig. 3B, red line). The stop codon is also in alignment, though it is a TG in fish as opposed to a TG in mammals, chicken, and lizard (Fig. 3B, red asterisk). There is minimal homology in the exonic sequence immediately downstream of the PTC, and the downstream intronic flanking region is not conserved. The annotated mammalian 5' splice site (Fig. 3B, red line) was not predicted by either the Human Splicing Finder (3) or the Alternative Splice Site Predictor (09). Neither program predicted a 5' splice site for the other species investigated either, suggesting a very weak 5' splice site for Mypt2 E24. This, along with other features described below, likely accounts for the extremely tissue-restricted and phylogenetically limited splicing of this exon. Alternatively, given the internal PTC in Mypt2 E24, it is conceivable that it could function as a terminal exon, in which case there would be no need for the 5' splice site, though there are no data to support this scenario at this time. The COOH-terminal LZ motif coded for by skipping of Mypt2 E24 is highly conserved through fish (Fig. 3C, left).
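The reading-frame argument used for both alternative exons can be illustrated with a toy calculation. The following is a minimal sketch, assuming made-up sequences: `upstream`, `alt_exon`, and `downstream` are hypothetical strings chosen for illustration and are not taken from any genome, and the helper names are not from the study. It shows that including an exon whose length is not a multiple of 3 (such as the 34 nt and 37 nt predicted fish exons) shifts the downstream frame and can expose a premature termination codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def frame_shift(exon_length):
    """Nucleotides by which inclusion of an exon shifts the downstream frame."""
    return exon_length % 3

def first_stop_index(cds):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Hypothetical toy sequences (illustration only, not real gene sequence).
upstream = "ATGGCTGCT"        # ATG GCT GCT
alt_exon = "GCTT"             # 4 nt, so inclusion shifts the frame by 1 nt
downstream = "GACCGCGCTTAA"   # GAC CGC GCT TAA when the exon is skipped

skipped = upstream + downstream
included = upstream + alt_exon + downstream

print(frame_shift(len(alt_exon)))   # 1
print(first_stop_index(skipped))    # 6: the normal stop at the end
print(first_stop_index(included))   # 4: a premature stop created by the shift
```

In this toy case the shifted frame reads the exon's last nucleotide together with the first two downstream nucleotides as TGA, which is the same kind of event the text describes for the PTC.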
The alternative LZ motif encoded by E24 is 48 5% conserved from human to fish and retains three of the four leucines of the alternative LZ motif (Fig. 3C, right). Interestingly, a distinct 67 nt alternative exon has been identified in the chicken Mypt2 (63), located in the same intron and nearly 4.5 kb upstream of the sequence with homology to mammalian E24. This chicken Mypt2 alternative E24 has a PTC in the fourth codon, resulting in a COOH-terminal LZ variant (63). We could not identify with confidence sequences homologous to the distinct chicken Mypt2 E24 in the other species investigated. The chicken genomic sequence homologous to mammalian Mypt2 E24 (Fig. 3B) has not been demonstrated to be a functional exon in chicken, though it could be that the proper tissues have not yet been examined, e.g., chicken skeletal muscle. Sequence that is conserved between human and fish within the coding portion of Mypt2 E24 contains predicted binding sites for splicing factors 9G8 and Tra (Fig. 3B) and an hnRNP site that may act as a splicing silencer (). Immediately upstream of the PTC is a conserved predicted exonic identity element: GGGC (human-lizard)/GGCC (fish) (Fig. 3B, Supplemental Table S). In contrast to Mypt1 E24, there is generally a lack of conserved SREs in the intronic flanking regions of Mypt2 E24: upstream of the 3' splice site is the pyrimidine-rich tract that is conserved as a feature but not at the level of individual nucleotides, while downstream of the 5' splice site there is a lack of conservation (Fig. 3B, Supplemental Table S). An exception is the intronic region immediately flanking and including the 3' splice site, which is highly conserved and contains a predicted binding site for SRSF6 (SRp55) that spans the 3' splice site (Fig. 3B) and may inhibit recruitment of the U2 splicing factor. Additional conserved SREs were identified (Supplemental Table S).
The absence of conserved SREs near the 5' splice site and the nonconsensus 5' splice site itself are consistent with default skipping of Mypt2 E24.

Complexity of the Mypt2 Locus

Mypt2, hsm, smm. PPP1R12B is a highly polymorphic gene locus where a number of unique transcripts are generated by alternative splicing of exons (described above) and alternative transcriptional start sites (TSS). Unique TSS generate first exons (transcripts) unique to skeletal (Mypt2), cardiac (hsm), and smooth (smm) muscle (Fig. 4) (,, 30). Each of the annotated first exons of human Mypt2, hsm, and smm is associated with indicators of transcriptional activity (H3K4Me1, H3K4Me3, and H3K27Ac), DNase hypersensitivity, and transcription factor binding as well as with TSS predictions (Fig. 4, B, C, E, red). This suggests a relatively unusual situation in which three loci within a single gene are under separate transcriptional control by the three muscle types. Interestingly, the TSS and first exon of hsm appear to differ among mammals. The first exon of human hsm is conserved in some species (e.g., rhesus, dog, elephant have 96-83% identity) but is

missing completely in the mouse and rat (Fig. 4C). Conversely, the sequence of the annotated first exon of mouse hsm is conserved in humans (87.5% identity) and is located kb downstream of the human hsm first exon (Fig. 4A, gray box).

Fig. 3. Phylogenetic conservation of isoforms of Mypt2 generated by alternative splicing of Exon 24. A: conservation of the myosin phosphatase targeting subunit (Mypt2) alternative exon and flanking region is shown by the phastCons track on the UCSC Genome Browser in the human hg19/GRCh37 genome release, as described in Fig. 2. B: mammalian (human, mouse) Mypt2 E24 and flanking regions are aligned to the homologous sequences in chicken, lizard, and fish PPP1R12B. Red lines and triangles highlight the known (mammalian) and predicted splice sites for E24. The red asterisk denotes the known (mammalian) and aligned stop codon. Conserved motifs were identified and analyzed for splicing cis-regulatory elements (see METHODS). C: diagrammed gene structure of the 3' end of the Mypt2 gene indicates alternative splicing of E24 and the change of the open reading frame, in black. Amino acid sequence alignment of the COOH-terminus of Mypt2 demonstrates high conservation of the E24-out leucine zipper (LZ) motif (left). The E24-in LZ motif (right) is more variable in fish. Leucine residues of the LZ motifs are highlighted in gray.
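The percent identity values quoted for exons and flanking regions in this section are the kind of figure that can be computed directly from a pairwise alignment. The following is a minimal sketch, assuming two pre-aligned sequences of equal length with `-` as the gap character; the function name and the choice to exclude gapped columns from the denominator are assumptions made here for illustration, and other conventions (for example, counting gaps as mismatches) give different numbers.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical aligned fragments (illustration only).
print(percent_identity("ACGTACGT", "ACGTTCGA"))  # 75.0
print(percent_identity("AC-GT", "ACGGT"))        # 100.0
```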
There is extensive conservation ( 80%) between the human and mouse in the sequence immediately upstream (400 bases) of the sequence for the mouse hsm first exon. However, in the human this region lacks a predicted TSS based on human GenBank cDNAs (49, 50). There is also a lack of H3 modifications, DNase hypersensitivity, and TF binding associated with transcriptional activity (Fig. 4D). Taken together, these observations are consistent with different transcriptional start sites and first exons between mouse and human hsm, with the caveat that these data were obtained from human cell lines. The mouse hsm first exon is highly conserved in the rat (95.4%), but only the 3' half of the exon is conserved in the other mammals (rhesus, dog, elephant, and opossum). Neither the human nor the mouse hsm first exon sequence is present in chicken or lower species, raising the question of whether hsm is generated in these species.

Fig. 4. Complexity of the Mypt2 (PPP1R12B) locus. A: the protein phosphatase 1 regulatory subunit 12B (PPP1R12B) locus hosts independent transcriptional start sites (TSS) for Mypt2 (98 aa), heart-specific M (hsm) (08 aa), and smooth muscle M (smm) (86 aa). The first exons and start codons (ATG) for each transcript are highlighted and color coded in this diagram, which uses the human PPP1R12B as the template (red: Mypt2; blue: hsm; green: smm). The alternate stop codons (TG) are also diagrammed. The region corresponding to the first exon of the mouse hsm is depicted by a gray box. Experimental evidence for independent transcriptional regulation (regions of histone 3 modification, DNase hypersensitivity, TF ChIP) is shown for the first exons of human Mypt2 (B), hsm (C), and smm (E). SwitchGear predicted TSSs are depicted as red lines. The mammalian and vertebrate conservation is also shown. D: experimental evidence for transcriptional regulation of the first exon of mouse hsm and conservation are shown.

In contrast, the TSS and unique first exon of smm (), located between exons 8 and 9 of Mypt2, are highly conserved in humans, chickens, and fishes (Fig. 4, A and E). Interestingly, the PPP1R12B ortholog in fish (tetraodon, medaka, and stickleback) appears to generate only the smm transcript; no genomic sequence with homology to mammalian Mypt2 spanning exons 8 is identified in the fish.
This suggests that the original PPP1R12B gene may have only coded for the small M subunit, and that the NH2-terminal ankyrin repeats and PP1c binding domains were later acquired through a recombination with another Mypt gene, most likely PPP1R12C given its closer phylogenetic relationship (Fig. B). Alternatively, the 5' end of the PPP1R12B gene could have been lost during fish evolution. Whereas the cause of this variability in the PPP1R12B gene structure during evolution is not defined, the variability itself is consistent with the difficulty in defining the function of MP (Mypt2) in striated muscle. The complexity of the locus is compounded by the tissue-specific expression of these independent transcripts. Mypt2 is transcribed in the skeletal muscle, heart, and, to a lesser extent, brain (30); hsm, as indicated by the name, has tissue-specific expression in cardiac muscle (); and smm is expressed in smooth muscle (69, 88). Additionally, splicing of PPP1R12B E24 is highly regulated (, 88). Inclusion of mammalian E24 in Mypt2 is highly restricted to skeletal muscle (unspecified type) in the context of the full-length Mypt2 transcript, with about half of the total transcript levels composed of the alternative isoform (). Additionally, both the E24-in and E24-out isoforms of hsm were cloned from human cardiac tissue (), though there is no report of their relative proportions. Inclusion of the distinct avian PPP1R12B 67 nt E24 occurs in the context of smm and is also tissue specific, representing the primary isoform in the fast (phasic) gizzard smooth muscle (5, 63). In contrast, inclusion of E24 (8 or 67 nt) in mammalian smooth

muscle smm transcripts is not identified in published studies (63) or in public gene expression datasets. More thorough investigations of the pattern of splicing of Mypt2 E24 are required to define its tissue specificity, potential coupling to gene transcription, and phylogenetic conservation.

MP Inhibitory Subunits

Four family members of the inhibitory subunit of myosin phosphatase (PPP1R14A-D) have been identified with little variation within each transcript (). Exon of CPI-17 (PPP1R14A) has been reported as alternative in human aorta (8), but our survey of rat smooth muscle tissue did not uncover an exon-skipped variant (data not shown), and there are no other reports of this variant in the databases. Alternative translational start sites have been proposed for human and mouse PHI (,) (PPP1R14B) (), but using the current mouse and human genome sequence, the proposed PHI- AUG sequences, 67 nt and 54 nt upstream of the well-validated start codons of mouse and human PHI-, respectively, 1) lack a Kozak consensus sequence for translational initiation and 2) would change the reading frame and cause premature termination after 4 amino acids, casting doubt on this as a bona fide translational variant. The diversity in the inhibitory subunits arises by their regulated transcription, yet the expressional and functional relationships between the inhibitory subunit family members are not well defined. The initial reports of CPI-17 (PPP1R14A), the first identified family member, demonstrated high levels of expression specific to smooth muscle (3). This finding is confirmed in public databases of human and mouse RNA-Seq and microarray tissue surveys (85, 88), with high levels of CPI-17 transcript in the aorta, bladder, lung, and prostate and much lower levels in striated muscle and nonmuscle cells, mirroring Mypt1 expression.
In contrast, PHI (PPP1R14B) expression is fairly ubiquitous, with published immunoblot evidence of protein expression () and ubiquitous expression in RNA-Seq data in general agreement (85, 88). The expression of the other inhibitory subunit family members KEPI (PPP1R14C) and gastric brain phosphatase inhibitor (GBPI) (PPP1R14D) is less pervasive and less robust. Published studies have characterized KEPI expression as most robust in cardiac tissue (60), while GBPI expression is primarily limited to the colon (59). There is evidence both for (8) and against (8) redundancy between the two main inhibitory subunits expressed in smooth muscle, PPP1R14A and B, yet there are no mouse knockout experiments to define the roles of the subunits. Our unpublished data from deep RNA sequencing indicate that CPI-17 (PPP1R14A) transcripts are approximately two- to ninefold higher than PHI-1 in vascular smooth muscle (unpublished data, RP Dippold and SA Fisher). Consistent with restricted expression and specialized function of PPP1R14A, the region upstream of the PPP1R14A TSS lacks TATA and CAAT boxes, has little phylogenetic conservation, and lacks indicators of transcriptional activity in cell lines (Fig. 5A). A limited region of 88 bases, 300-500 bases upstream of the TSS, could be aligned between human and mouse and putative TFBS identified, including a GC box that was previously identified and shown to function as a minimal promoter in vitro [(5) and Supplemental Table S3]. We used a number of criteria in an attempt to computationally identify potential PPP1R14A (CPI-17) transcriptional enhancers (see METHODS) (36, ). Regions within the first intron immediately upstream of exon and within the third intron are well conserved, suggesting possible regulatory functions (Fig. 5A). These regions were hypersensitive to DNase and contained histone modifications consistent with enhancer activity. Algorithms predicted a number of binding motifs for TFs involved in smooth muscle phenotypic determination (reviewed in Ref.
55), including Enhancer box (E-box), cAMP response element-binding protein (CREB), nuclear factor of activated T-cells (NFAT), and peroxisome proliferator-activated receptor-γ (PPARγ), but notably no CArG motifs for the binding of serum response factor (SRF) and myocardin, a well-validated smooth muscle cis-regulatory motif (reviewed in Ref. 8). The absence of indicators of transcriptional activity near the TSS likely reflects the tissue-specific transcription of PPP1R14A and its low expression in the various cell lines in which these assays are performed. In contrast to PPP1R14A, PPP1R14B appears to have considerably more transcriptional activity in cell lines, as is suggested by the histone 3 modifications, DNase hypersensitivity, and TF ChIP (Fig. 5B), likely indicative of its unrestricted expression. The intronic and upstream regions of PPP1R14B are more highly conserved, allowing for a more extensive report of conserved TFBS. Potential conserved regulatory elements include a muscle-specific TATA (MTATA) box (gctggccctttgggg)-SRF combination 44 bp upstream of the TSS, an AP site within 0 bp of the SRF site, multiple SP sites, and an site in the conserved region proximal to the TSS (Fig. 5B). The first intron of PPP1R14B contains many conserved, high-scoring putative TFBS (Fig. 5B, Supplemental Table S4). As with the upstream promoter region, several of the putative TFBS in the first intron are involved with muscle-specific gene expression, muscle differentiation, homeostasis, and growth, such as NFAT, CREB, and Forkhead box protein O (FOXO) (reviewed in Ref. 5). This computational prediction of transcriptional control of the MP inhibitory subunits provides a foundation for experimental testing of their functional importance in different cell types.

DISCUSSION

The control of muscle function by protein phosphorylation, reflecting the regulated activities of Ser-Thr Type 1 phosphatases and kinases, is pervasive and also muscle type specific.
Yet there remains limited understanding of the diversity within the components of the signaling pathway and its effect on the control of muscle function. Here we focused on the myosin phosphatase and analyzed public databases to define modes of diversity and the regulation of the variability of the MP subunits. Considerable diversity is present within its regulatory subunits. This reflects evolutionary genomic diversification as a whole and includes an increase in the number of regulatory (inhibitory) gene family members, multiplied by expanding combinations of alternative transcriptional start sites and exon splicing that vastly increase the number of unique transcripts generated from each gene locus. The completion of sequencing of many genomes facilitated a phylogenetic analysis of Mypt1 E24 splice variants. The skipping of Mypt1 E24, the evolutionary and tissue default,

Fig. 5. Analysis of CPI-17 (PPP1R14A) and PHI-1 (PPP1R14B) noncoding sequence for potential transcriptional regulatory activity. A: protein phosphatase 1 regulatory subunit 14A (PPP1R14A). Portions of noncoding sequence in introns 1 and 3 (red boxes) are well conserved between human and mouse and have other indicators suggestive of transcriptional enhancer activity (H3 modification, DNase hypersensitivity, and TF ChIP). The transcription factor binding sites (TFBS) predicted in each of these blocks of sequence are shown. The TFBS in black are predicted using the rVista "optimized for function" approach to reduce false positives (see METHODS). The TFBS in gray have a singular cutoff of 0.85 similarity to the TRANSFAC positional weighted matrix (PWM) and are of interest to muscle gene regulation. B: PPP1R14B. Noncoding sequence is well conserved immediately upstream of the TSS and in intron 1 (red boxes) and has other features suggestive of transcriptional promoter and enhancer activity, respectively. The TFBS predicted in each of these blocks of sequence are shown. The full compilation of TFBS for the two cutoffs can be found in Supplemental Tables S3 and S4.
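The idea of calling a binding site when a sequence window reaches a similarity cutoff of 0.85 against a positional weight matrix can be sketched in a few lines. The following is a minimal illustration only: the matrix `PFM` holds hypothetical counts and is not a TRANSFAC entry, and the min-max normalized score used here is a simplifying assumption rather than the scoring scheme of TRANSFAC-based tools or rVista.

```python
# Toy position frequency matrix: one dict of base counts per motif position.
# Hypothetical values chosen for illustration; consensus is ACGT.
PFM = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 0, "G": 1, "T": 9},
]

def similarity(site, pfm):
    """Min-max normalized matrix score in [0, 1] for a window of motif width."""
    score = sum(column[base] for column, base in zip(pfm, site))
    lowest = sum(min(column.values()) for column in pfm)
    highest = sum(max(column.values()) for column in pfm)
    return (score - lowest) / (highest - lowest)

def scan(sequence, pfm, cutoff=0.85):
    """Return (start, site, similarity) for every window scoring >= cutoff."""
    sequence = sequence.upper()
    width = len(pfm)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        if not set(site) <= set("ACGT"):
            continue  # skip windows containing ambiguous bases
        value = similarity(site, pfm)
        if value >= cutoff:
            hits.append((start, site, round(value, 3)))
    return hits

print(scan("TTACGTGGCCGT", PFM))               # [(2, 'ACGT', 1.0)]
print(scan("TTACGTGGCCGT", PFM, cutoff=0.75))  # adds the weaker CCGT site at 8
```

Lowering the cutoff admits the near-consensus site as well, which is the trade-off between sensitivity and false positives that the two cutoffs in the legend address.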
codes for a COOH-terminal LZ motif, a highly conserved feature of Mypt family members (Mypt1, Mypt2, and MBS85; reviewed in Ref. 34), consistent with its proposed role in LZ-mediated heterodimerization with cGK and NO/cGMP-dependent activation of MP (4, 5, 0) (57). The LZ, in which a leucine (or isoleucine) residue is present at every seventh amino acid in the context of an α-helical coiled-coil domain, was originally described in what is now termed the B-ZIP family of transcription factors (56). In this large family, high throughput and computational studies suggest that specificity in LZ-mediated hetero- and homodimerizations produces great diversity and specificity in the transcriptional output (90, 06). There is some evidence for similar specificity and diversity in LZ-mediated interactions in the regulation of myosin phosphatase and other contractile proteins controlling muscle function (reviewed in Refs. 5 and 39). Mypt1 and Mypt2 have alternative COOH-termini generated by alternative splicing of E24 that abolishes or creates an alternative LZ motif, respectively. The Mypt1 E24 sequence is highly conserved as an alternative exon from fish to mammals, but is not present in the ancestral ortholog in flies and worms, and is also absent from some vertebrate classes such as amphibians. This suggests that this alternative exon emerged during evolution and was under strong selection pressure to be retained. We propose that it arose as a mechanism to suppress NO/cGMP regulation of MP activity in phasic smooth muscle tissues. A number of studies have shown that NO or its second messenger cGMP is unable to activate MP as a means for relaxation of prototypical phasic smooth muscle such as the rat portal vein (6, 83) and chicken gizzard (5, 86).
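The defining feature of the LZ described above, a leucine (or isoleucine) at every seventh residue, lends itself to a simple sequence check. The following is a minimal sketch, assuming a hypothetical peptide and a hypothetical threshold of four consecutive heptad positions; it tests only the spacing rule and says nothing about coiled-coil propensity, which dedicated prediction tools assess.

```python
def heptad_runs(protein, residues="LI", min_repeats=4):
    """Find runs in which a residue from `residues` recurs at every 7th position.

    Returns a list of (start_index, repeat_count) for runs that reach
    `min_repeats`. Indices are 0-based.
    """
    protein = protein.upper()
    runs = []
    for start, aa in enumerate(protein):
        if aa not in residues:
            continue
        # Skip positions that merely continue a run started 7 residues earlier.
        if start >= 7 and protein[start - 7] in residues:
            continue
        count = 0
        pos = start
        while pos < len(protein) and protein[pos] in residues:
            count += 1
            pos += 7
        if count >= min_repeats:
            runs.append((start, count))
    return runs

# Hypothetical peptide (illustration only): L/I at positions 2, 9, 16, 23.
toy = "MK" + "LEEKVKT" + "IEEKVKT" + "LEEKVKT" + "LEEKVKT"
print(heptad_runs(toy))         # [(2, 4)]
print(heptad_runs("LLLLAAAA"))  # []: adjacent leucines are not a heptad repeat
```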
MP expression and activity are severalfold higher in phasic versus tonic smooth muscle (3), all supporting the hypothesis that higher basal, yet unregulated, MP activity, at least with respect to NO/cGMP signaling, is required for cycling of phasic smooth muscle contraction and relaxation. This hypothesis could be tested by deletion of E24 in the mouse, converting all smooth muscle tissues to the E24-skipped (LZ+) isoform. There has been less study of the expression pattern of the Mypt2 E24 splice variants coding for COOH-terminal LZ variants. The few studies that have examined this describe the

sample as skeletal muscle, leaving open the possibility that the expression of the variants may vary by striated muscle type (fast, slow), species, or developmental age. Interestingly, the fish Mypt2 homologue only codes for the small (M) subunit, suggesting that either 1) MP activity is not required for striated muscle function in fish or 2) this function is served by another Mypt family member. LZ-mediated dimerization partners of each Mypt2 LZ variant have not been determined, though given the strong similarity of the E24-skipped encoded LZ to Mypt1 (and MBS85) it seems likely that it would also bind PKG. Both the E24-in and E24-out encoded Mypt2 LZ motifs are slightly basic and have similar charge profiles. However, the E24 encoded LZ motif is uniquely followed by a COOH-terminal acidic tail of residues that could modulate interactions with regulatory proteins. Entirely missing from the models of LZ motifs and MP protein interactions is consideration of the small (M) subunit of MP, which has yet to be well characterized in terms of tissue-specific expression of isoforms and function. Finally, it is worth noting that the upstream regulatory kinase PKG also has two evolutionarily conserved isoforms containing alternative NH2-terminal LZ motifs (α, β) generated from alternative transcriptional start sites (75). The individual LZ motifs of the two isoforms are thought to create specificity in their substrates, though there remain limited data in this regard (reviewed in Refs. 8 and 39). The current study describing the tissue-specific and phylogenetic patterns of expression of the MP LZ-containing subunits provides a foundation for experimental testing of the role of PKG activation of MP in muscle-type specific control of function in appropriate model organisms.
Portions of the intronic sequence flanking Mypt1 E24 are also highly conserved, consistent with a prior genome-wide evolutionary study of alternative splicing that found evolutionary conservation of alternative exons was associated with conservation of flanking intronic regulatory sequences (67). Aside from the splice site junction sequences, the most invariant sequence is the hexanucleotide 5'-TCTG-3' located just upstream of the AG of the 3' splice site. This sequence was computationally predicted to be an exonic splicing enhancer and exonic identity element (5, 9). Exonic elements of this nature are thought to be necessary for proper exon identification and splicing (4). However, when found in intronic regions flanking exons, they can act as splicing repressors (48, 66, 9). The proximity of this element to the E24 3' splice site would predict that it would suppress splicing of the exon by blocking recruitment of U2 to this site. Its function as a cis-regulator of E24 splicing remains to be tested, which given its conservation could be performed in any model system and most expeditiously in the zebrafish. The high conservation of Mypt1 E24 sequence among higher vertebrates (mammals, birds, reptiles: 6/3 nt identity) is less present in the fish. The exonic cis-element (GCGGU) that binds the splicing factor Tra and enhances splicing of Mypt1 E24 (9, 97) is absent. Whether this results in less efficient tissue-specific splicing of E24 in fish or reflects different control mechanisms requires further study. Predicted binding sites for other classic Ser/Arg-rich (SR) splicing factors [SRSF2 (SRp30b) and SRSF6 (SRp55) (9)] are also highly evolutionarily conserved and could be involved in the regulated splicing of this exon. Overall, there remains limited understanding of the regulation of alternative splicing in the generation of smooth muscle phenotypic diversity and somewhat more understanding of this topic in striated muscle (reviewed in Refs. 7, 3, and 46).
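One simple way to look for invariant short elements of the kind discussed above is to ask which k-mers are shared by the homologous sequence of every species. The following is a minimal sketch under stated assumptions: the function names are illustrative, the flanking bases in the toy sequences are made up, and only the embedded hexamers ctgaaa/tgaaag (human-lizard) and ctgaag/tgaagg (fish) come from the text. It is an exact-match comparison and is not the SRE prediction pipeline described in METHODS.

```python
def kmers(sequence, k):
    """Set of all k-length substrings of a sequence (case-insensitive)."""
    sequence = sequence.lower()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def shared_kmers(sequences, k=6):
    """Sorted k-mers present in every sequence of a name -> sequence mapping."""
    sets = [kmers(seq, k) for seq in sequences.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))

# Toy sequences: made-up flanks around the hexamers quoted in the text.
toy = {
    "human": "ttcctgaaagca",
    "lizard": "gacctgaaagtt",
    "fish": "aactgaagggca",
}

tetrapods = {name: toy[name] for name in ("human", "lizard")}
print(shared_kmers(tetrapods, k=6))  # ['cctgaa', 'ctgaaa', 'tgaaag']
print(shared_kmers(toy, k=6))        # []: the fish hexamers differ by one base
print(shared_kmers(toy, k=5))        # ['ctgaa']: the core shared by all three
```

Exact matching misses elements that differ by a single substitution, as the fish example shows, so a shorter k or a mismatch-tolerant comparison may be needed across larger evolutionary distances.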
The highly programmed and tissue-specific nature of Mypt1 and Mypt2 E24 splicing makes them good candidates for model exon approaches to this problem. The four PPP1R14 family members have much less complex gene structures with little variability within individual genes, which is not surprising given their small size and few (4-5) exons. There is a single and well-validated report of a splice variant of PPP1R14A (CPI-17) in human aortic tissue in which skipping of exon deletes a segment of the PP1 inhibitory domain (8). However, alternative splicing of this highly conserved exon in other species is not present in the queried databases, such that this variation may be unique to humans. Rather, the diversity in the inhibitory subunit seems to derive from the variability among the family members and the highly regulated and tissue-specific expression of these genes, though it should be acknowledged that the function and relationships between the family members in vivo have not yet been defined through gene knock-outs. Expression of CPI-17 mRNA and protein is highly tissue-specific, being greatly enriched in smooth muscle with some variation between muscle types (7), and dynamically altered in disease (4, 7, 73, 74, 93). Little is known about its transcriptional regulation in these contexts. An upstream minimal promoter was identified (5), but there is little conservation of this upstream region between species, and no CAAT or TATA boxes, all consistent with a highly regulated transcript. We found highly conserved noncoding sequence in introns 1 and 3 that is predicted to contain a number of transcriptional cis-regulatory elements. Some transcription factors that bind these predicted elements, such as NFAT and PPAR (5, 9), mediate the slow muscle gene program and regulate responses to external signals, as do CREB, Stat, SMAD, and Ets (reviewed in Ref. 5).
Definition of these conserved and predicted transcriptional enhancer domains will require functional testing; not all regulatory sequences are conserved, and conservation of noncoding and putative regulatory sequence provides an increased likelihood but not certainty that they would function as such. In contrast, PHI has the features of a less tightly regulated gene, with a highly conserved upstream sequence that in cell lines has histone marks and DNase hypersensitivity indicative of an active promoter.

Perspectives and Significance

The phosphorylation and dephosphorylation of myosin is the primary means by which force is controlled in smooth muscle and is thought to provide a modulatory role in striated muscle. In this study we have used computational analyses of large publicly available databases to describe the diversity in MP subunits. This more comprehensive analysis has revealed a number of features of the MP subunits that are predicted to underlie the functional significance and expressional regulation of this diversity. Of particular note are 1) the deeply conserved alternative splicing of Mypt1 E24, putative upstream splicing regulatory element, and PTC, 2) the substantial phylogenetic variability at the Mypt2 (PPP1R12B) locus with a variety of tissue-specific transcripts and alternative splicing that matches its uncertain role in striated muscle function, and 3) the phylogenetic expansion of the inhibitory subunit (PPP1R14A-D) gene family with little intragenic diversity, and difference in structure of

putative promoter-enhancers between the highly (CPI-17) and less highly (PHI-1) regulated family members. This comprehensive phylogenetic analysis of MP subunit diversity will enable the optimal selection of model organisms for testing hypotheses as to the regulation and function of subunit isoforms in determining specificity in the control of signaling pathways that regulate muscle function.

GRANTS

This work was supported by National Institutes of Health Grants HL-667 to S. A. Fisher and T3 R00759 to R. P. Dippold.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

AUTHOR CONTRIBUTIONS

Author contributions: R.P.D. and S.A.F. conception and design of research; R.P.D. performed experiments; R.P.D. and S.A.F. analyzed data; R.P.D. and S.A.F. interpreted results of experiments; R.P.D. prepared figures; R.P.D. drafted manuscript; R.P.D. and S.A.F. edited and revised manuscript; R.P.D. and S.A.F. approved final version of manuscript.

REFERENCES

1. Arimura T, Suematsu N, Zhou YB, Nishimura J, Satoh S, Takeshita A, Kanaide H, Kimura A. Identification, characterization, and functional analysis of heart-specific myosin light chain phosphatase small subunit. J Biol Chem 76: 6073 608, 00.
2. Ast G. How did alternative splicing evolve? Nat Rev Genet 5: 773 78, 004.
3. Barbosa-Morais NL, Irimia M, Pan Q, Xiong HY, Gueroussov S, Lee LJ, Slobodeniuc V, Kutter C, Watt S, Colak R, Kim T, Misquitta-Ali CM, Wilson MD, Kim PM, Odom DT, Frey BJ, Blencowe BJ. The evolutionary landscape of alternative splicing in vertebrate species. Science 338: 587 593, 0.
4. Bourgeois CF, Popielarz M, Hildwein G, Stevenin J. Identification of a bidirectional splicing enhancer: differential involvement of SR proteins in 5' or 3' splice site activation. Mol Cell Biol 9: 7347 7356, 999.
5. Braun T, Gautel M. Transcriptional mechanisms regulating skeletal muscle differentiation, growth and homeostasis. Nature Rev : 349 36, 0.
6.
Brawand D, Soumillon M, Necsulea, Julien P, Csardi G, Harrigan P, Weier M, Liechti, ximu-petri, Kircher M, lbert FW, Zeller U, Khaitovich P, Grutzner F, Bergmann S, Nielsen R, Paabo S, Kaessmann H. The evolution of gene expression levels in mammalian organs. Nature 478: 343 348, 0. 7. Cartegni L, Wang J, Zhu Z, Zhang MQ, Krainer R. ESE finder: web resource to identify exonic splicing enhancers. Nucleic cids Res 3: 3568 357, 003. 8. Casteel DE, Smith-Nguyen EV, Sankaran B, Roh SH, Pilz RB, Kim C. crystal structure of the cyclic GMP-dependent protein kinase I dimerization/docking domain reveals molecular details of isoform-specific anchoring. J Biol Chem 85: 3684 3688, 00. 9. Ceulemans H, Stalmans W, Bollen M. Regulator-driven functional diversification of protein phosphatase- in eukaryotic evolution. Bioessays 4: 37 38, 00. 0. Ceulemans H, Bollen M. Functional diversity of protein phosphatase-, a cellular economizer and reset button. Physiol Rev 84: 39, 004.. Chen YH, Chen MX, lessi D, Campbell DG, Shanahan C, Cohen P, Cohen PT. Molecular cloning of cdn encoding the 0 kda and kda regulatory subunits of smooth muscle protein phosphatase. FEBS Lett 356: 5 55, 994.. Del Gatto-Konczak F, Olive M, Gesnel MC, Breathnach R. hnrnp recruited to an exon in vivo can function as an exon splicing silencer. Mol Cell Biol 9: 5 60, 999. 3. Desmet FO, Hamroun D, Lalande M, Collod-Beroud G, Claustres M, Beroud C. Human Splicing Finder: an online bioinformatics tool to predict splicing signals. Nucleic cids Res 37: e67, 009. 4. Dickson D. Gene estimate rises as US and UK discuss freedom of access. Nature 40: 3, 999. R67 5. Dippold RP, Fisher S. Myosin phosphatase isoforms as determinants of smooth muscle contractile function and calcium sensitivity of force production. Microcirculation : 39 48, 03. 6. Dirksen WP, Vladic F, Fisher S. myosin phosphatase targeting subunit isoform transition defines a smooth muscle developmental phenotypic switch. 
m J Physiol Cell Physiol 78: C589 C600, 000. 7. El-Touhky, Given M, Cochard, Brozovich FV. PHI- induced enhancement of myosin phosphorylation in chicken smooth muscle. FEBS Lett 579: 47 477, 005. 8. El-Toukhy, Given M, Ogut O, Brozovich FV. PHI- interacts with the catalytic subunit of myosin light chain phosphatase to produce a Ca( ) independent increase in MLC(0) phosphorylation and force in avian smooth muscle. FEBS Lett 580: 5779 5784, 006. 9. ENCODE Project Consortium. user s guide to the encyclopedia of DN elements (ENCODE). PLoS Biol 9: e00046, 0. 0. ENCODE Project Consortium, Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. n integrated encyclopedia of DN elements in the human genome. Nature 489: 57 74, 0.. Eto M. Regulation of cellular protein phosphatase- (PP) by phosphorylation of the CPI-7 family, C-kinase-activated PP inhibitors. J Biol Chem 84: 3573 3577, 009.. Eto M, Karginov, Brautigan DL. novel phosphoprotein inhibitor of protein type- phosphatase holoenzymes. Biochemistry 38: 695 6957, 999. 3. Eto M, Senba S, Morita F, Yazawa M. Molecular cloning of a novel phosphorylation-dependent inhibitory protein of protein phosphatase- (CPI7) in smooth muscle: its specific localization in smooth muscle. FEBS 40: 356 360, 997. 4. Ewing B, Green P. nalysis of expressed sequence tags indicates 35,000 human genes. Nat Genet 5: 3 34, 000. 5. Fairbrother WG, Yeh RF, Sharp P, Burge CB. Predictive identification of exonic splicing enhancers in human genes. Science 97: 007 03, 00. 6. Feletou M, Hoeffner U, Vanhoutte PM. Endothelium-dependent relaxing factors do not affect the smooth muscle of portal vein. Blood Vessels 6: 3, 989. 7. Fisher S. Vascular smooth muscle phenotypic diversity and function. Physiol Genomics 4: 69 87, 00. 8. Frith MC, Saunders NF, Kobe B, Bailey TL. Discovering sequence motifs with arbitrary insertions and deletions. PLoS Comput Biol 4: e00007, 008. 9. Fu K, Mende Y, Bhetwal BP, Baker S, Perrino B, Wirth B, Fisher S. 
Trabeta is required for tissue-specific splicing of a smooth muscle myosin phosphatase targeting subunit alternative exon. J Biol Chem 87: 6575 6585, 0. 30. Fujioka M, Takahashi N, Odai H, raki S, Ichikawa K, Feng J, Nakamura M, Kaibuchi K, Hartshorne DJ, Nakano T, Ito M. new isoform of human myosin phosphatase targeting/regulatory subunit (MYPT): cdn cloning, tissue expression, and chromosomal mapping. Genomics 49: 59 68, 998. 3. Gong MC, Cohen P, Kitazawa T, Ikebe M, Masuo M, Somlyo P, Somlyo V. Myosin light chain phosphatase activites and the effects of phosphatase inhibitors in tonic and phasic smooth muscle. J Biol Chem 67: 466 4668, 99. 3. Gooding C, Smith CW. Tropomyosin exons as models for alternative splicing. dv Exp Med Biol 644: 7 4, 008. 33. Goren, Ram O, mit M, Keren H, Lev-Maor G, Vig I, Pupko T, st G. Comparative analysis identifies exonic splicing regulatory sequences The complex definition of enhancers and silencers. Mol Cell : 769 78, 006. 34. Grassie ME, Moffat LD, Walsh MP, MacDonald J. The myosin phosphatase targeting protein (MYPT) family: a regulated mechanism for achieving substrate specificity of the catalytic subunit of protein phosphatase type delta. rch Biochem Biophys 50: 47 59, 0. 35. Han YS, Brozovich FV. ltered reactivity of tertiary mesenteric arteries following acute myocardial ischemia. J Vasc Res 50: 00 08, 03. 36. Hardison RC. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet 6: 369 37, 000. 37. Hardison RC, Taylor J. Genomic approaches towards finding cisregulatory modules in animals. Nat Rev Genet 3: 469 483, 0. 38. Hartshorne DJ, Ito M, Erdodi F. Role of protein phosphatase type in contractile functions: myosin phosphatase. J Biol Chem 79: 37 374, 004. JP-Regul Integr Comp Physiol doi:0.5/ajpregu.0045.04 www.ajpregu.org

R68 39. Hofmann F, Bernhard D, Lukowski R, Weinmeister P. cgmp regulated protein kinases (cgk). Handbk Exp Pharmacol 37 6, 009. 40. Hong F, Haldeman BD, Jackson D, Carter M, Baker JE, Cremo CR. Biochemistry of smooth muscle myosin light chain kinase. rch Biochem Biophys 50: 35 46, 0. 4. Hu W, Mahavadi S, Li F, Murthy KS. Upregulation of RGS4 and downregulation of CPI-7 mediate inhibition of colonic muscle contraction by interleukin-. m J Physiol Cell Physiol 93: C99 C000, 007. 4. Huang QQ, Fisher S, Brozovich FV. Unzipping the role of myosin light chain phosphatase in smooth muscle cell relaxation. J Biol Chem 79: 597 603, 004. 43. Hunter S, Jones P, Mitchell, pweiler R, ttwood TK, Bateman, Bernard T, Binns D, Bork P, Burge S, de Castro E, Coggill P, Corbett M, Das U, Daugherty L, Duquenne L, Finn RD, Fraser M, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, Mcnulla C, McDowall J, McMenamin C, Mi H, Mutowo-Muellenet P, Mulder N, Natale D, Orengo C, Pesseat S, Punta M, Quinn F, Rivoire C, Sangrador- Vegas, Selengut JD, Sigrist CJ, Scheremetjew M, Tate J, Thimmajanarthanan M, Thomas PD, Wu CH, Yeats C, Yong SY. InterPro in 0: new developments in the family and domain prediction database. Nucleic cids Res 40: D306 D3, 0. 44. 
International Human Genome Sequencing Consortium, Lander E.S, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian, Wyman D, Rogers J, Sulston J, inscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson, Deadman R, Deloukas P, Dunham, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt, Jones M, Lloyd C, McMurray, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra M, Mardis ER, Fulton L, Chinwalla T, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs R, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama, Hattori M, Yada T, Toyoda, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, rtiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal, Platzer M, Nyakatura G, Taudien S, Rump, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan, Qin S, Davis RW, Federspiel N, bola P, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Shimizu N, Kawasaki K, Minoshima S, Evans G, thanasiou M, Schultz R, Roe B, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blocker H, Hornischer K, Nordsiek G, garwala R, ravind L, 
Bailey J, Bateman, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones T, Kasif S, Kaspryzk, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit F, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld, Wetterstrand K, Patrinos, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ. Initial sequencing and analysis of the human genome. Nature 409: 860 9, 00. 45. Ito M, Nakano T, Erdodi F, Hartshorne DJ. Myosin phosphatase: structure, regulation and function. Mol Cell Biochem 59: 97 09, 004. 46. Kalsotra, Cooper T. Functional consequences of developmentally regulated alternative splicing. Nat Rev Genet : 75 79, 0. 47. Kamm KE, Stull JT. Signaling to myosin regulatory light chain in sarcomeres. J Biol Chem 86: 994 9947, 0. 48. Kanopka, Muhlemann O, kusjarvi G. Inhibition by SR proteins of splicing of a regulated adenovirus pre-mrn. Nature 38: 535 538, 996. 49. Karolchik D, Hinrichs S, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ. The UCSC Table Browser data retrieval tool. Nucleic cids Res 3: D493 D496, 004. 50. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler M, Haussler D. The human genome browser at UCSC. Genome Res : 996 006, 00. 5. Khatri JJ, Joyce KM, Brozovich FV, Fisher S. Role of myosin phosphatase isoforms in cgmp-mediated smooth muscle relaxation. J Biol Chem 76: 3750 3757, 00. 5. Kim JI, Urban M, Young GD, Eto M. Reciprocal regulation controlling the expression of CPI-7, a specific inhibitor protein for the myosin light chain phosphatase in vascular smooth muscle cells. 
m J Physiol Cell Physiol 303: C58 C68, 0. 53. Kitazawa T, Kitazawa K. Size-dependent heterogeneity of contractile Ca -sensitization in rat arterial smooth muscle. J Physiol 590: 540 543, 0. 54. Kitazawa T, Polzin N, Eto M. CPI-7-deficient smooth muscle of chicken. J Physiol 557: 55 58, 004. 55. Kudryavtseva O, alkjaer C, Matchkov VV. Vascular smooth muscle cell phenotype is defined by Ca -dependent transcription factors. FEBS J 80: 5488 5499, 03. 56. Landschulz WH, Johnson PF, McKnight SL. The leucine zipper: a hypothetical structure common to a new class of DN binding proteins. Science 40: 759 764, 988. 57. Lee E, Hayes DB, Langsetmo K, Sundberg EJ, Tao TC. Interactions between the leucine-zipper motif of cgmp-dependent protein kinase and the C-terminal region of the targeting subunit of myosin light chain phosphatase. J Mol Biol 373: 98, 007. 58. Lin Q, Buckler ESt Muse SV, Walker JC. Molecular evolution of type serine/threonine protein phosphatases. Mol Phylogenet Evol : 57 66, 999. 59. Liu QR, Zhang PW, Lin Z, Li QF, Woods S, Troncoso J, Uhl GR. GBPI, a novel gastrointestinal- and brain-specific PP-inhibitory protein, is activated by PKC and inactivated by PK. Biochem J 377: 7 8, 004. 60. Liu QR, Zhang PW, Zhen Q, Walther D, Wang XB, Uhl GR. KEPI, a PKC-dependent protein phosphatase inhibitor regulated by morphine. J Biol Chem 77: 33 330, 00. 6. Loots GG, Ovcharenko I. rvist.0 evolutionary analysis of transcription factor binding sites. Nucleic cids Res 3: W7 W, 004. 6. Lu Y, Zhang H, Gokina N, Mandala M, Sato O, Ikebe M, Osol G, Fisher S. Uterine artery myosin phosphatase isoform switching and increased sensitivity to SNP in a rat L-NME model of hypertension of pregnancy. m J Physiol Cell Physiol 94: C564 C57, 008. 63. Mabuchi K, Gong BJ, Langsetmo K, Ito M, Nakano T, Tao T. Isoforms of the small non-catalytic subunit of smooth muscle myosin light chain phosphatase. Biochim Biophys cta 434: 96 303, 999. 64. 
Marygold SJ, Leyland PC, Seal RL, Goodman JL, Thurmond J, Strelets VB, Wilson RJ. FlyBase: improvements to the bibliography. Nucleic cids Res 4: D75 D757, 03. 65. Matsumura F, Hartshorne DJ. Myosin phosphatase target subunit: many roles in cell function. Biochem Biophys Res Commun 369: 49 56, 008. 66. McNally LM, McNally MT. n RN splicing enhancer-like sequence is a component of a splicing inhibitor element from Rous sarcoma virus. Mol Cell Biol 8: 303 3, 998. 67. Merkin J, Russell C, Chen P, Burge CB. Evolutionary dynamics of gene and isoform regulation in Mammalian tissues. Science 338: 593 599, 0. 68. Mizuno T, Tsutsui K, Nishida Y. Drosophila myosin phosphatase and its role in dorsal closure. Development 9: 5 3, 00. 69. Moorhead G, Johnson D, Morrice N, Cohen P. The major myosin phosphatase in skeletal muscle is a complex between the beta-isoform of protein phosphatase and the MYPT gene product. FEBS Lett 438: 4 44, 998. JP-Regul Integr Comp Physiol doi:0.5/ajpregu.0045.04 www.ajpregu.org

R69 70. Moorhead GB, De Wever V, Templeton G, Kerk D. Evolution of protein phosphatases in plants and animals. Biochem J 47: 40 409, 009. 7. Mueed I, Zhang L, MacLeod KM. Role of the PKC/CPI-7 pathway in enhanced contractile responses of mesenteric arteries from diabetic rats to alpha-adrenoceptor stimulation. Br J Pharmacol 46: 97 98, 005. 7. Nelson C, Wardle FC. Conserved non-coding elements and cis regulation: actions speak louder than words. Development 40: 385 395, 03. 73. Ohama T, Hori M, Momotani E, Iwakura Y, Guo F, Kishi H, Kobayashi S, Ozaki H. Intestinal inflammation downregulates smooth muscle CPI-7 through induction of TNF- and causes motility disorders. m J Physiol Gastrointest Liver Physiol 9: G49 G438, 007. 74. Ohama T, Hori M, Sato K, Ozaki H, Karaki H. Chronic treatment with interleukin- attenuates contractions by decreasing the activities of CPI-7 and MYPT- in intestinal smooth muscle. J Biol Chem 78: 48794 48804, 003. 75. Orstavik S, Natarajan V, Tasken K, Jahnsen T, Sandberg M. Characterization of the human gene encoding the Type Ia and Type Ib cgmp-dependent protein kinase (prkg). Genomics 4: 3 38, 997. 76. Ovcharenko I, Loots GG, Giardine BM, Hou M, Ma J, Hardison RC, Stubbs L, Miller W. Mulan: multiple-sequence local alignment and visualization for studying function and evolution. Genome Res 5: 84 94, 005. 77. Ovcharenko I, Nobrega M, Loots GG, Stubbs L. ECR Browser: a tool for visualizing and accessing data from comparisons of multiple vertebrate genomes. Nucleic cids Res 3: W8-W86, 004. 78. Owens GK, Kumar MS, Wamhoff BR. Molecular regulation of vascular smooth muscle cell differentiation in development and disease. Physiol Rev 84: 767 80, 004. 79. Pal S, Gupta R, Kim H, Wickramasinghe P, Baubet V, Showe LC, Dahmane N, Davuluri RV. lternative transcription exceeds alternative splicing in generating the transcriptome diversity of cerebellar development. Genome Res : 60 7, 0. 80. 
Pandit SB, Gosar D, bhiman S, Sujatha S, Dixit SS, Mhatre NS, Sowdhamini R, Srinivasan N. SUPFM a database of potential protein superfamily relationships derived by comparing sequence-based and structure-based families: implications for structural genomics and function annotation in genomes. Nucleic cids Res 30: 89 93, 00. 8. Pang H, Guo Z, Xie Z, Su W, Gong MC. Divergent kinase signaling mediates agonist-induced phosphorylation of phosphatase inhibitory proteins PHI- and CPI-7 in vascular smooth muscle cells. m J Physiol Cell Physiol 90: C89 C899, 006. 8. Parmacek MS. Myocardin not quite MyoD. rterioscler Thromb Vasc Biol 4: 535 537, 004. 83. Payne MC, Zhang HY, Prosdocimo T, Joyce KM, Koga Y, Ikebe M, Fisher S. Myosin phosphatase isoform switching in vascular smooth muscle development. J Mol Cell Cardiol 40: 74 8, 006. 84. Payne MC, Zhang HY, Shirasawa Y, Koga Y, Ikebe M, Benoit JN, Fisher S. Dynamic changes in expression of myosin phosphatase in a model of portal hypertension. m J Physiol Heart Circ Physiol 86: H80 H80, 004. 85. Petryszak R, Burdett T, Fiorelli B, Fonseca N, Gonzalez-Porta M, Hastings E, Huber W, Jupp S, Keays M, Kryvych N, McMurry J, Marioni JC, Malone J, Megy K, Rustici G, Tang Y, Taubert J, Williams E, Mannion O, Parkinson HE, Brazma. Expression tlas update a database of gene and transcript expression from microarrayand sequencing-based functional genomics experiments. Nucleic cids Res 4: D96 D93, 04. 86. Pfitzer G, Merkel L, Ruegg JC, Hofmann F. Cyclic GMP-dependent protein kinase relaxes skinned fibers from guinea pig taenia coli but not from chicken gizzard. Pflügers rch 407: 87 9, 986. 87. Plowman GD, Sudarsanam S, Bingham J, Whyte D, Hunter T. The protein kinases of caenorhabditis elegans: a model for signal transduction in multicellular organisms. Proc Natl cad Sci US 96: 3603 360, 999. 88. Pohl, Sugnet CW, Clark T, Smith K, Fujita P, Cline MS. ffy exon tissues: exon levels in normal tissues in human, mouse and rat. 
Bioinformatics 5: 44 443, 009. 89. Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, Pang N, Forslund K, Ceric G, Clements J, Heger, Holm L, Sonnhammer EL, Eddy SR, Bateman, Finn RD. The Pfam protein families database. Nucleic cids Res 40: D9-D30, 0. 90. Reinke W, Baek J, shenberg O, Keating E. Networks of bzip protein-protein interactions diversified over a billion years of evolution. Science 340: 730 734, 03. 9. Russell P, Feilchenfeldt J, Schreiber S, Praz M, Crettenand, Gobelet C, Meier C, Bell DR, Kralli, Giacobino JP, Deriaz O. Endurance training in humans leads to fiber type-specific increases in levels of peroxisome proliferator-activated receptor-gamma coactivator- and peroxisome proliferator-activated receptor-alpha in skeletal muscle. Diabetes 5: 874 88, 003. 9. Sanford JR, Ellis J, Caceres JF. Multiple roles of arginine/serine-rich splicing factors in RN processing. Biochem Soc Trans 33: 443 446, 005. 93. Sato K, Ohkura S, Kitahara Y, Ohama T, Hori M, Sato M, Kobayashi S, Sasaki Y, Hayashi T, Nasu T, Ozaki H. Involvement of CPI-7 downregulation in the dysmotility of the colon from dextran sodium sulphate-induced experimental colitis in a mouse model. Neurogastroenterol Motil 9: 504 54, 007. 94. Scruggs SB, Solaro RJ. The significance of regulatory light chain phosphorylation in cardiac physiology. rch Biochem Biophys 50: 9 34, 0. 95. Shaye DD, Greenwald I. OrthoList: a compendium of C. elegans genes with human orthologs. PLos One 6: e0085, 0. 96. Shimizu H, Ito M, Miyahara M, Ichikawa K, Okubo S, Konishi T, Naka M, Tanaka T, Hirano K, Hartshorne DJ, Nakano T. Characterization of the myosin-binding subunit of smooth muscle myosin phosphatase. J Biol Chem 69: 30407 304, 994. 97. Shukla S, Fisher S. Trabeta as a novel mediator of vascular smooth muscle diversification. Circ Res 03: 485 49, 008. 98. Sievers F, Wilm, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Soding J, Thompson JD, Higgins DG. 
Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7: 539, 0. 99. Smith PJ, Zhang C, Wang J, Chew SL, Zhang MQ, Krainer R. n increased specificity score matrix for the prediction of SF/SF-specific exonic splicing enhancers. Hum Mol Genet 5: 490 508, 006. 00. Somlyo P, Somlyo V. Signal transduction and regulation in smooth muscle. Nature 37: 3 36, 994. 0. Surks HK, Mochizuki N, Kasai Y, Georgescu SP, Tang KM, Ito M, Lincoln TM, Mendelsohn ME. Regulation of myosin phosphatase by a specific interaction with cgmp- dependent protein kinase Ialpha. Science 86: 583 587, 999. 0. Thomas PD, Campbell MJ, Kejariwal, Mi H, Karlak B, Daverman R, Diemer K, Muruganujan, Narechania. PNTHER: a library of protein families and subfamilies indexed by function. Genome Res 3: 9 4, 003. 03. Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, Haugen E, Sheffield NC, Stergachis B, Wang H, Vernot B, Garg K, John S, Sandstrom R, Bates D, Boatman L, Canfield TK, Diegel M, Dunn D, Ebersol K, Frum T, Giste E, Johnson K, Johnson EM, Kutyavin T, Lajoie B, Lee BK, Lee K, London D, Lotakis D, Neph S, Neri F, Nguyen ED, Qu H, Reynolds P, Roach V, Safi, Sanchez ME, Sanyal, Shafer, Simon JM, Song L, Vong S, Weaver M, Yan Y, Zhang Z, Lenhard B, Tewari M, Dorschner MO, Hansen RS, Navas P, Stamatoyannopoulos G, Iyer VR, Lieb JD, Sunyaev SR, key JM, Sabo PJ, Kaul R, Furey TS, Dekker J, Crawford GE, Stamatoyannopoulos J. The accessible chromatin landscape of the human genome. Nature 489: 75 8, 0. 04. 
Venter JC, dams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans C, Holt R, Gocayne JD, manatides P, Ballew RM, Huson DH, Wortman JR, Zhang Q, Kodira CD, Zheng XH, Chen L, Skupski M, Subramanian G, Thomas PD, Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark G, Nadeau J, McKusick V, Zinder N, Levine J, Roberts RJ, Simon M, Slayman C, Hunkapiller M, Bolanos R, Delcher, Dew I, Fasulo D, Flanigan M, Florea L, Halpern, Hannenhalli S, Kravitz S, Levy S, Mobarry C, Reinert K, Remington K, bu-threideh J, Beasley E, Biddick K, Bonazzi V, Brandon R, Cargill M, Chandramouliswaran I, Charlab R, Chaturvedi K, Deng Z, Di Francesco V, Dunn P, Eilbeck K, Evangelista C, Gabrielian E, Gan W, Ge W, Gong F, Gu Z, Guan P, Heiman TJ, Higgins ME, Ji RR, Ke Z, Ketchum K, Lai Z, Lei Y, Li Z, Li J, Liang Y, Lin X, Lu F, Merkulov GV, Milshina N, Moore HM, Naik K, Narayan V, Neelam B, Nusskern D, Rusch DB, Salzberg S, Shao W, Shue B, Sun J, Wang Z, Wang, Wang X, Wang J, Wei M, Wides R, Xiao C, Yan C, Yao, Ye J, Zhan M, JP-Regul Integr Comp Physiol doi:0.5/ajpregu.0045.04 www.ajpregu.org

R70 Zhang W, Zhang H, Zhao Q, Zheng L, Zhong F, Zhong W, Zhu S, Zhao S, Gilbert D, Baumhueter S, Spier G, Carter C, Cravchik, Woodage T, li F, n H, we, Baldwin D, Baden H, Barnstead M, Barrow I, Beeson K, Busam D, Carver, Center, Cheng ML, Curry L, Danaher S, Davenport L, Desilets R, Dietz S, Dodson K, Doup L, Ferriera S, Garg N, Gluecksmann, Hart B, Haynes J, Haynes C, Heiner C, Hladun S, Hostin D, Houck J, Howland T, Ibegwam C, Johnson J, Kalush F, Kline L, Koduru S, Love, Mann F, May D, McCawley S, McIntosh T, McMullen I, Moy M, Moy L, Murphy B, Nelson K, Pfannkoch C, Pratts E, Puri V, Qureshi H, Reardon M, Rodriguez R, Rogers YH, Romblad D, Ruhfel B, Scott R, Sitter C, Smallwood M, Stewart E, Strong R, Suh E, Thomas R, Tint NN, Tse S, Vech C, Wang G, Wetter J, Williams S, Williams M, Windsor S, Winn-Deen E, Wolfe K, Zaveri J, Zaveri K, bril JF, Guigo R, Campbell MJ, Sjolander KV, Karlak B, Kejariwal, Mi H, Lazareva B, Hatton T, Narechania, Diemer K, Muruganujan, Guo N, Sato S, Bafna V, Istrail S, Lippert R, Schwartz R, Walenz B, Yooseph S, llen D, Basu, Baxendale J, Blick L, Caminha M, Carnes-Stine J, Caulk P, Chiang YH, Coyne M, Dahlke C, Mays, Dombroski M, Donnelly M, Ely D, Esparham S, Fosler C, Gire H, Glanowski S, Glasser K, Glodek, Gorokhov M, Graham K, Gropman B, Harris M, Heil J, Henderson S, Hoover J, Jennings D, Jordan C, Jordan J, Kasha J, Kagan L, Kraft C, Levitsky, Lewis M, Liu X, Lopez J, Ma D, Majoros W, McDaniel J, Murphy S, Newman M, Nguyen T, Nguyen N, Nodell M, Pan S, Peck J, Peterson M, Rowe W, Sanders R, Scott J, Simpson M, Smith T, Sprague, Stockwell T, Turner R, Venter E, Wang M, Wen M, Wu D, Wu M, Xia, Zandieh, Zhu X. The sequence of the human genome. Science 9: 304 35, 00. 05. Vilella J, Severin J, Ureta-Vidal, Heng L, Durbin R, Birney E. EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res 9: 37 335, 009. 06. Vinson C, charya, Taparowsky EJ. 
Deciphering B-ZIP transcription factor interactions in vitro and in vivo. Biochim Biophys cta 759: 4, 006. 07. Wang DZ, Olson EN. Control of smooth muscle development by the myocardin family of transcriptional coactivators. Curr Opin Genet Dev 4: 558 566, 004. 08. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. lternative isoform regulation in human tissue transcriptomes. Nature 456: 470 476, 008. 09. Wang M, Marin. Characterization and prediction of alternative splice sites. Gene 366: 9 7, 006. 0. Wang Z, Rolish ME, Yeo G, Tung V, Mawson M, Burge CB. Systematic identification and analysis of exonic splicing silencers. Cell 9: 83 845, 004.. Wasserman WW, Palumbo M, Thompson W, Fickett JW, Lawrence CE. Human-mouse genome comparisons to locate regulatory sites. Nat Genet 6: 5 8, 000.. Waterhouse M, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview Version a multiple sequence alignment editor and analysis workbench. Bioinformatics 5: 89 9, 009. 3. Weirauch MT, Hughes TR. Conserved expression without conserved regulatory sequence: the more things change, the more they stay the same. Trends Genet 6: 66 74, 00. 4. Wingender E, Dietze P, Karas H, Knuppel R. TRNSFC: a database on transcription factors and their DN binding sites. Nucleic cids Res 4: 38 4, 996. 5. Wissmann, Ingles J, Mains PE. The Caenorhabditis elegans mel- myosin phosphatase regulatory subunit affects tissue contraction in the somatic gonad and the embryonic epidermis and genetically interacts with the Rac signaling pathway. Dev Biol 09: 7, 999. 6. Wissmann, Ingles J, McGhee JD, Mains PE. Caenorhabditis elegans LET-50 is related to Rho-binding kinases and human myotonic dystrophy kinase and interacts genetically with a homolog of the regulatory subunit of smooth muscle myosin phosphatase to affect cell shape. Genes Develop : 409 4, 997. 7. Woodsome TP, Eto M, Everett, Brautigan DL, Kitazawa T. 
Expression of CPI-7 and myosin phosphatase correlates with Ca( ) sensitivity of protein kinase C-induced contraction in rabbit smooth muscle. J Physiol 535: 553 564, 00. 8. Yamawaki K, Ito M, Machida H, Moriki N, Okamoto R, Isaka N, Shimpo H, Kohda, Okumura K, Hartshorne DJ, Nakano T. Identification of human CPI-7, an inhibitory phosphoprotein for myosin phosphatase. Biochem Biophys Res Commun 85: 040 045, 00. 9. Zhang C, Li WH, Krainer R, Zhang MQ. RN landscape of evolution for optimal exon and intron discrimination. Proc Natl cad Sci US 05: 5797 580, 008. 0. Zhang H, Fisher S. Conditioning effect of blood flow on resistance artery smooth muscle myosin phosphatase. Circ Res 00: 730 737, 007.. Zhang XHF, Chasin L. Computational definition of sequence motifs governing constitutive exon splicing. Genes Develop 8: 4 50, 004. JP-Regul Integr Comp Physiol doi:0.5/ajpregu.0045.04 www.ajpregu.org