CRAM (file format)

CRAM
Filename extension: .cram
Developed by: Markus Hsi-Yang Fritz et al.; Vadim Zalunin; James Bonfield
Type of format: Bioinformatics
Open format?: yes
Website: www.ga4gh.org/cram/, www.ebi.ac.uk/ena/software/cram-toolkit

Compressed Reference-oriented Alignment Map (CRAM) is a compressed columnar file format for storing biological sequences aligned to a reference sequence, initially devised by Markus Hsi-Yang Fritz et al.[1]

CRAM was designed to be an efficient reference-based alternative to the Sequence Alignment Map (SAM) and Binary Alignment Map (BAM) file formats. It optionally uses a genomic reference to describe differences between the aligned sequence fragments and the reference sequence, reducing storage costs. Additionally, each column in the SAM format is separated into its own blocks, improving the compression ratio. CRAM files are typically 30 to 60% smaller than the equivalent BAM files, depending on the data held within them.
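The reference-based idea can be illustrated with a minimal sketch. It is not the CRAM encoding itself: it handles substitutions only (no insertions, deletions, or clipping), and the names `encode_read` and `decode_read` are hypothetical helpers invented for this example.

```python
from typing import List, Tuple

# (offset within the read, base that replaces the reference base)
Diff = Tuple[int, str]


def encode_read(reference: str, start: int, read: str) -> Tuple[int, int, List[Diff]]:
    """Describe a read as (start, length, substitutions) relative to the reference."""
    window = reference[start:start + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    diffs = [
        (offset, base)
        for offset, (ref_base, base) in enumerate(zip(window, read))
        if base != ref_base
    ]
    return start, len(read), diffs


def decode_read(reference: str, start: int, length: int, diffs: List[Diff]) -> str:
    """Rebuild the read by copying the reference window and applying substitutions."""
    bases = list(reference[start:start + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTTAGCCGATAGGCTTAACG"
read = "TAGCCGTTAGG"  # aligned at position 8, with one mismatching base

record = encode_read(reference, 8, read)
restored = decode_read(reference, *record)
```

Only the start position, the length, and a single substitution need to be stored for this read; the remaining bases are recovered from the reference when decoding.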

Implementations of CRAM exist in htsjdk,[2] htslib,[3] JBrowse,[4] and Scramble.[5]

The file format specification is maintained by the Global Alliance for Genomics and Health (GA4GH)[6] with the specification document available from the EBI cram toolkit page.[7]

File format

The basic structure of a CRAM file is a series of containers, the first of which holds a compressed copy of the SAM header. Subsequent containers consist of a container Compression Header followed by a series of slices which in turn hold the alignment records themselves, formatted as a series of blocks.

CRAM file:

Magic number | Container (SAM header) | Container (Data) | ... | Container (Data) | Container (EOF)

Container:

Container Header | Compression Header | Slice | ... | Slice

Slice:

Slice Header | Block | Block | ... | Block
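As a rough sketch of how a reader might check the leading magic number before walking the containers, the following assumes the fixed-size file definition laid out in the published CRAM specification: the four bytes "CRAM", a one-byte major version, a one-byte minor version, and a 20-byte file identifier. That layout is drawn from the specification rather than from the text above, so it should be checked against the version in use. The names `FileDefinition` and `read_file_definition` are hypothetical.

```python
import struct
from typing import NamedTuple


class FileDefinition(NamedTuple):
    major: int
    minor: int
    file_id: bytes


# Assumed layout: 4-byte magic, 1-byte major, 1-byte minor, 20-byte file id.
_FILE_DEFINITION = struct.Struct("<4sBB20s")


def read_file_definition(data: bytes) -> FileDefinition:
    """Parse the fixed-size prefix found at the very start of a CRAM file."""
    if len(data) < _FILE_DEFINITION.size:
        raise ValueError("input is too short to hold a file definition")
    magic, major, minor, file_id = _FILE_DEFINITION.unpack_from(data, 0)
    if magic != b"CRAM":
        raise ValueError("missing CRAM magic number")
    return FileDefinition(major, minor, file_id)


# Synthetic header built in memory; a real file would be read with open(path, "rb").
sample = b"CRAM" + bytes([3, 0]) + b"example.cram".ljust(20, b"\x00")
definition = read_file_definition(sample)
```

The containers follow immediately after this prefix; parsing them requires the variable-length integer encodings defined in the specification and is left out of this sketch.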

CRAM constructs records from a set of data series, describing the components of an alignment. The container Compression Header specifies which data series is encoded in which block, what codec will be used, and any codec specific meta-data (for example a table of Huffman symbol code lengths). While data series can be mixed together within the same block, keeping them separate usually improves compression and provides the opportunity for efficient selective decoding where only some data types are required.
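The effect of keeping data series in separate blocks can be sketched as follows. This is an illustration under stated assumptions, not CRAM's actual encoding: `zlib` stands in for the codecs, the three-field record (position, flag, mapping quality) is a made-up simplification, and every function name is hypothetical.

```python
import struct
import zlib
from typing import List, Sequence, Tuple

# (position, flag, mapping quality): illustrative fields only
Record = Tuple[int, int, int]


def pack_interleaved(records: Sequence[Record]) -> bytes:
    """Row-oriented layout: all fields of a record are stored together, then compressed."""
    raw = b"".join(struct.pack("<IHB", *record) for record in records)
    return zlib.compress(raw)


def pack_columnar(records: Sequence[Record]) -> List[bytes]:
    """Column-oriented layout: one independently compressed block per data series."""
    n = len(records)
    positions = struct.pack(f"<{n}I", *(r[0] for r in records))
    flags = struct.pack(f"<{n}H", *(r[1] for r in records))
    quals = struct.pack(f"<{n}B", *(r[2] for r in records))
    return [zlib.compress(column) for column in (positions, flags, quals)]


def unpack_columnar(blocks: Sequence[bytes]) -> List[Record]:
    """Decode every block and zip the series back into records."""
    pos_raw, flag_raw, qual_raw = (zlib.decompress(block) for block in blocks)
    n = len(qual_raw)  # one byte per record in the quality series
    positions = struct.unpack(f"<{n}I", pos_raw)
    flags = struct.unpack(f"<{n}H", flag_raw)
    quals = struct.unpack(f"<{n}B", qual_raw)
    return list(zip(positions, flags, quals))


def decode_positions_only(blocks: Sequence[bytes]) -> List[int]:
    """Selective decoding: only the first block is decompressed."""
    pos_raw = zlib.decompress(blocks[0])
    return list(struct.unpack(f"<{len(pos_raw) // 4}I", pos_raw))


records = [(1000 + 37 * i, 99 if i % 2 == 0 else 147, 60) for i in range(500)]
blocks = pack_columnar(records)
interleaved_size = len(pack_interleaved(records))
columnar_size = sum(len(block) for block in blocks)
```

Comparing `interleaved_size` with `columnar_size` on real data gives a feel for the compression benefit, while `decode_positions_only` shows how a reader interested in one data series can leave the other blocks untouched.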

Selective access to a CRAM file is provided by the index (with file-name suffix ".crai"). On data sorted by chromosome and position, the index indicates which region is covered by each slice. On unsorted data, the index may be used simply to fetch the Nth container. Selective decoding may also be achieved by using the Compression Header to skip specified data series when only partial records are required.
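A region lookup against the index might be sketched as below. The sketch assumes the ".crai" layout given in the CRAM specification: gzip-compressed text with one line per slice and six tab-separated integers (reference sequence id, alignment start, alignment span, container byte offset, slice offset within the container, slice size). The names `CraiEntry`, `parse_crai`, and `slices_overlapping` are hypothetical, and the sample index is synthetic.

```python
import gzip
from typing import List, NamedTuple, Sequence


class CraiEntry(NamedTuple):
    ref_id: int            # reference sequence id
    start: int             # alignment start of the slice
    span: int              # number of reference bases covered
    container_offset: int  # byte offset of the container in the file
    slice_offset: int      # byte offset of the slice within the container
    slice_size: int        # slice size in bytes


def parse_crai(raw: bytes) -> List[CraiEntry]:
    """Parse a gzip-compressed index into one entry per slice."""
    text = gzip.decompress(raw).decode("ascii")
    return [
        CraiEntry(*(int(field) for field in line.split("\t")))
        for line in text.splitlines()
        if line.strip()
    ]


def slices_overlapping(
    entries: Sequence[CraiEntry], ref_id: int, start: int, end: int
) -> List[CraiEntry]:
    """Return the slices whose covered region overlaps [start, end] (inclusive)."""
    return [
        entry
        for entry in entries
        if entry.ref_id == ref_id
        and entry.start <= end
        and entry.start + entry.span - 1 >= start
    ]


# Synthetic index built in memory; a real one would be read from "<name>.cram.crai".
sample_index = gzip.compress(
    b"0\t1\t10000\t1000\t200\t5000\n"
    b"0\t10001\t10000\t6500\t200\t4800\n"
    b"1\t1\t5000\t12000\t180\t2500\n"
)
entries = parse_crai(sample_index)
hits = slices_overlapping(entries, ref_id=0, start=9500, end=10500)
```

A reader would then seek to each hit's `container_offset` and decode only those slices instead of scanning the whole file.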

History

Year | Version(s) | Notes
2010-11 | pre-CRAM | Initial paper describing the reference based format. This did not use the name CRAM, but called it mzip. This software was implemented in Python as a prototype and demonstration of the basic concepts.[1]
2011-12 | 0.3–0.86 | Vadim Zalunin of the European Bioinformatics Institute (EBI) produced the first implementation named CRAM as a package called CRAMtools,[8] written in the Java programming language.
2012 | 1.0[9] | Implemented in Java CRAMtools.[10]
2013 | - | C implementation added to the Scramble[11][5] tool, by James Bonfield of the Wellcome Sanger Institute.
2013 | 2.0 | Changes included support for more than one reference per slice (useful with highly fragmented assemblies), better encoding of SAM auxiliary tags, splitting soft-clip and inserted bases into their own data-series, meta-data to track the number of records and bases per slice, and corrections to the BF (BAM flag) data-series.
2013 | - | Added to htslib (0.2.0).
2014 | 2.1[12] | Added EOF blocks, to help identify truncated files.
2014 | - | Added to htsjdk (1.127).
2014 | 3.0[13] | Inclusion of lzma and rANS codecs for block compression, along with multiple checksums for ensuring data integrity.
2018 | - | JavaScript implementation as part of JBrowse[4] (1.15.0), by Rob Buels.
2021 | - | Rust implementation in Noodles.[14]
2023 | 3.1[15] | Officially adopted. (Draft from 2019)

CRAM version 4.0 exists as a prototype in Scramble,[5] initially demonstrated in 2015, but has yet to be adopted as a standard.

References

  1. Hsi-Yang Fritz, Markus; Leinonen, Rasko; Cochrane, Guy; Birney, Ewan (May 2011). "Efficient storage of high throughput DNA sequencing data using reference-based compression". Genome Research. 21 (5): 734–740. doi:10.1101/gr.114819.110. ISSN 1549-5469. PMC 3083090. PMID 21245279.
  2. "Htsjdk by Broad Institute". samtools.github.io. Retrieved 2018-10-14.
  3. "Samtools". www.htslib.org. Retrieved 2018-10-14.
  4. "JBrowse · A fast, embeddable genome browser built with HTML5 and JavaScript". jbrowse.org. Retrieved 2018-10-14.
  5. Bonfield, James K. (2014-06-14). "The Scramble conversion tool". Bioinformatics. 30 (19): 2818–2819. doi:10.1093/bioinformatics/btu390. ISSN 1460-2059. PMC 4173023. PMID 24930138.
  6. "GA4GH". www.ga4gh.org. Retrieved 2018-10-14.
  7. EMBL-EBI. "CRAM toolkit < Software < European Nucleotide Archive < EMBL-EBI". www.ebi.ac.uk. Retrieved 2018-10-14.
  8. "vadimzalunin/crammer". GitHub. 2017-08-08. Retrieved 2018-10-14.
  9. "CRAM 1.0 Specification" (PDF).
  10. "enasequence/cramtools". GitHub. 2018-10-02. Retrieved 2018-10-14.
  11. "jkbonfield/io_lib". GitHub. 2018-10-16. Retrieved 2018-10-14.
  12. "CRAM 2.1 Specification" (PDF).
  13. "CRAM 3.0 Specification" (PDF).
  14. "zaeleus/noodles". GitHub. https://github.com/zaeleus/noodles/
  15. "CRAM 3.1 Specification" (PDF).
