DjVu

Filename extensions: .djvu, .djv
Internet media type: image/vnd.djvu, image/x-djvu
Magic number: AT&T
Developed by: AT&T Labs – Research
Initial release: 1998; 26 years ago
Latest release: Version 26,[1] April 2005; 19 years ago
Type of format: Image file formats
Contained by: Interchange File Format
Open format?: Yes

DjVu[a] is a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, indexed color images, and photographs. It uses technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal (monochrome) images. This allows high-quality, readable images to be stored in a minimum of space, so that they can be made available on the web.

DjVu has been promoted as providing smaller files than PDF for most scanned documents.[3] The DjVu developers report that color magazine pages compress to 40–70 kB, black-and-white technical papers compress to 15–40 kB, and ancient manuscripts compress to around 100 kB; a satisfactory JPEG image typically requires 500 kB.[4] Like PDF, DjVu can contain an OCR text layer, making it easy to perform copy and paste and text search operations.

Free creators, manipulators, converters, web browser plug-ins, and desktop viewers are available.[2] DjVu is supported by a number of multi-format document viewers and e-book reader software on Linux (Okular, Evince, Zathura), Windows (Okular, SumatraPDF), and Android (Document Viewer,[5] FBReader, EBookDroid, PocketBook).

History

The DjVu technology was originally developed by Yann LeCun, Léon Bottou, Patrick Haffner, Paul G. Howard, Patrice Simard, and Yoshua Bengio at AT&T Labs from 1996 to 2001.[4]

Prior to the standardization of PDF in 2008,[6][7] DjVu had been considered superior because it was an open file format, whereas PDF was proprietary at the time. The declared higher compression ratio (and thus smaller file size) and the claimed ease of converting large volumes of text into DjVu format were further arguments for DjVu's superiority over PDF in the technology landscape of 2004. In a 2004 talk on IT Conversations, independent technologist Brewster Kahle discussed the benefits of allowing easier access to DjVu files.[8][9]

The DjVu library distributed as part of the open-source package DjVuLibre has become the reference implementation for the DjVu format. DjVuLibre has been maintained and updated by the original developers of DjVu since 2002.[10]

The DjVu file format specification has gone through a number of revisions, the most recent being from 2005.

Revision history
Version | Release date | Status | Notes
1–19[citation needed] | 1996–1999 | Old version, no longer maintained | Developmental versions by AT&T Labs preceding the sale of the format to LizardTech.
20[1] | April 1999 | Old version, no longer maintained | DjVu version 3. DjVu changed from a single-page format to a multipage format.
21[1] | September 1999 | Older version, still maintained | Indirect storage format replaced. The searchable text layer was added.
22[1] | April 2001 | Older version, still maintained | Page orientation, color JB2.
23[1] | July 2002 | Old version, no longer maintained | CID chunk.
24[1] | February 2003 | Old version, no longer maintained | LTAnno chunk.
25[1] | May 2003 | Older version, still maintained | NAVM chunk. Support for DjVu bookmarks (outlines) was added. Changes made by Versions 23 and 24 were made obsolete.
26[1] | April 2005 | Current stable version | Text/line annotations.

Role in the software ecosystem

The primary use of the DjVu format has been the electronic distribution of documents with a quality comparable to that of printed documents. As that niche is also the primary use of PDF, it was inevitable that the two formats would become competitors. The two formats nevertheless approach the problem of delivering high-resolution documents in very different ways: PDF primarily encodes graphics and text as vectorised data, whereas DjVu primarily encodes them as pixmap images. This means PDF places the burden of rendering the document on the reader, whereas DjVu places that burden on the creator.

For a number of years, significantly overlapping with the period when DjVu was being developed, there were no PDF viewers for free operating systems; a particular stumbling block was the rendering of vectorised fonts, which are essential for combining small file size with high resolution in PDF. Since displaying DjVu was a simpler problem for which free software was available, there were suggestions that the free software movement should employ DjVu instead of PDF for distributing documentation; rendering for creating DjVu is in principle not much different from rendering for a device-specific printer driver, and DjVu can as a last resort be generated from scans of paper media. However, when FreeType 2.0 began in 2000 to provide rendering of all major vectorised font formats, that specific advantage of DjVu began to erode.

In the 2000s, with the growth of the World Wide Web and before widespread adoption of broadband, DjVu was often adopted by digital libraries as their format of choice, thanks to its integration with software like Greenstone[11] and the Internet Archive,[12] browser plugins which allowed advanced online browsing, smaller file size for comparable quality of book scans and other image-heavy documents,[13] and support for embedding and searching full text from OCR.[14][15] Some features, such as thumbnail previews, were later integrated into the Internet Archive's BookReader,[16] and DjVu browsing was deprecated in its favour around 2015, when some major browsers stopped supporting NPAPI and, with it, DjVu plugins.[17]

DjVu.js Viewer attempts to replace the missing plugins.

Technical overview

File structure

The DjVu file format is based on the Interchange File Format and is composed of hierarchically organized chunks. The IFF structure is preceded by a 4-byte AT&T magic number. Following is a single FORM chunk with a secondary identifier of either DJVU or DJVM for a single-page or a multi-page document, respectively.

All the chunks can be contained in a single file, in the case of so-called bundled documents, or spread across several files: one file for every page plus some files with shared chunks.
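The header layout described above can be illustrated with a minimal Python sketch. The function name read_djvu_header and the returned dictionary are hypothetical, chosen for illustration only; the sketch assumes the standard IFF convention of a big-endian 32-bit chunk length and is not taken from the reference implementation.

```python
import struct


def read_djvu_header(data: bytes) -> dict:
    """Parse the 16-byte preamble: magic, FORM id, length, secondary id."""
    if len(data) < 16:
        raise ValueError("too short to be a DjVu file")
    if data[:4] != b"AT&T":
        raise ValueError("missing AT&T magic number")
    # IFF chunk header: 4-byte identifier, then a big-endian 32-bit length.
    chunk_id, length = struct.unpack(">4sI", data[4:12])
    if chunk_id != b"FORM":
        raise ValueError("expected a FORM chunk after the magic number")
    form_type = data[12:16]
    kinds = {b"DJVU": "single-page", b"DJVM": "multi-page"}
    return {
        "form_type": form_type.decode("ascii"),
        # Anything other than DJVU/DJVM is reported as "other" (assumption).
        "kind": kinds.get(form_type, "other"),
        "length": length,
    }
```

Calling the function on the first 16 bytes of a file is enough to tell a single-page document from a multi-page one without reading any page data.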

Chunk types

Chunk types in DjVu files
Chunk identifier | Contained by | Description
FORM:DJVU | FORM:DJVM | Describes a single page. Can either be at the root of a document and be a single-page document or referred to from a DIRM chunk.
FORM:DJVM | (none) | Describes a multi-page document. Is the document's root chunk.
FORM:DJVI | FORM:DJVM | Contains data shared by multiple pages.
FORM:THUM | FORM:DJVM | Contains thumbnails.
INFO | FORM:DJVU | Must be the first chunk. Describes the page width, height, format version, resolution, gamma, and rotation.
DIRM | FORM:DJVM | Must be the first chunk. References other FORM chunks. These chunks can either follow this chunk inside the FORM:DJVM chunk or be contained in external files. These types of documents are referred to as bundled or indirect, respectively.
NAVM | FORM:DJVM | If present, must immediately follow the DIRM chunk. Contains a BZZ-compressed outline of the document.
ANTa, ANTz | FORM:DJVI or FORM:DJVU | Annotations.
TXTa, TXTz | FORM:DJVU | Unicode text and layout information.
INCL | FORM:DJVU | The ID of an included FORM:DJVI chunk.
Sjbz | FORM:DJVU | BZZ-compressed JB2 bitonal data used to store the mask.
Djbz | FORM:DJVI or FORM:DJVU | Shared shape table.
WMRM | ? | JB2 data required to remove a watermark.
CIDa | FORM:DJVU | Obsolete chunk with unknown content.
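As an illustration of how the chunks in the table are laid out, the following Python sketch walks the chunks of a single-page FORM:DJVU document and decodes part of its INFO chunk. The function names are hypothetical. The padding of odd-length chunks to an even boundary follows the general IFF convention, and the INFO byte layout used here (big-endian width and height, little-endian resolution at byte offset 6) is an assumption based on the published specification; it should be checked against that specification before being relied upon.

```python
import struct
from typing import Iterator, List, Tuple


def iter_chunks(data: bytes, start: int, end: int) -> Iterator[Tuple[str, bytes]]:
    """Yield (identifier, body) for each chunk between start and end."""
    pos = start
    while pos + 8 <= end:
        chunk_id, length = struct.unpack(">4sI", data[pos:pos + 8])
        yield chunk_id.decode("ascii"), data[pos + 8:pos + 8 + length]
        # IFF convention: odd-length bodies are followed by one pad byte.
        pos += 8 + length + (length & 1)


def list_page_chunks(data: bytes) -> List[Tuple[str, int]]:
    """Return (identifier, body length) for the chunks of the root FORM."""
    if data[:8] != b"AT&TFORM":
        raise ValueError("not a DjVu file")
    (form_len,) = struct.unpack(">I", data[8:12])
    # The FORM length counts the 4-byte secondary identifier at offset 12.
    end = min(len(data), 12 + form_len)
    return [(cid, len(body)) for cid, body in iter_chunks(data, 16, end)]


def parse_info(body: bytes) -> dict:
    """Decode width, height and resolution from an INFO body (assumed layout)."""
    width, height = struct.unpack(">HH", body[:4])
    (dpi,) = struct.unpack("<H", body[6:8])
    return {"width": width, "height": height, "dpi": dpi}
```

For a multi-page FORM:DJVM document the same walker would first encounter the DIRM chunk and then the nested FORM chunks, each of which could be walked in turn.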

Compression

DjVu divides a single image into many different images, then compresses them separately. To create a DjVu file, the initial image is first separated into three images: a background image, a foreground image, and a mask image. The background and foreground images are typically lower-resolution color images (e.g., 100 dpi); the mask image is a high-resolution bilevel image (e.g., 300 dpi) and is typically where the text is stored. The background and foreground images are then compressed using a wavelet-based compression algorithm named IW44.[4] The mask image is compressed using a method called JB2 (similar to JBIG2). The JB2 encoding method identifies nearly identical shapes on the page, such as multiple occurrences of a particular character in a given font, style, and size. It compresses the bitmap of each unique shape separately, and then encodes the locations where each shape appears on the page. Thus, instead of compressing a letter "e" in a given font multiple times, it compresses the letter "e" once (as a compressed bit image) and then records every place on the page it occurs.
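The shape-reuse idea behind JB2 can be sketched in a few lines of Python. This is a toy model, not the JB2 codec: bitmaps are tuples of strings, shapes are matched only when they are exactly identical (JB2 also matches nearly identical shapes), and nothing is actually entropy-coded. The names Bitmap and build_shape_dictionary are hypothetical.

```python
from typing import Dict, List, Tuple

# A glyph bitmap as rows of '#' (ink) and '.' (paper); illustrative only.
Bitmap = Tuple[str, ...]


def build_shape_dictionary(
    glyphs: List[Tuple[Bitmap, int, int]]
) -> Tuple[List[Bitmap], List[Tuple[int, int, int]]]:
    """Store each distinct bitmap once and record where it occurs.

    glyphs is a list of (bitmap, x, y) items found on the page.
    Returns the list of unique shapes and a list of
    (shape index, x, y) placements.
    """
    shapes: List[Bitmap] = []
    index: Dict[Bitmap, int] = {}
    placements: List[Tuple[int, int, int]] = []
    for bitmap, x, y in glyphs:
        if bitmap not in index:
            index[bitmap] = len(shapes)
            shapes.append(bitmap)
        placements.append((index[bitmap], x, y))
    return shapes, placements
```

A page containing the letter "e" a thousand times would, in this model, store one bitmap and a thousand small placement records rather than a thousand bitmaps.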

Optionally, these shapes may be mapped to UTF-8 codes (either by hand or potentially by a text recognition system) and stored in the DjVu file. If this mapping exists, it is possible to select and copy text.

Since JB2 (also called DjVuBitonal) is a variation on JBIG2, working on the same principles,[18] both compression methods have the same problems when performing lossy compression. In 2013 it emerged that Xerox photocopiers and scanners had been substituting digits for similar-looking ones, for example replacing a 6 with an 8.[19] A DjVu document has been spotted in the wild with character substitutions, such as an n with bleeding serifs turning into a u and an o with a spot inside turning into an e.[20] Whether lossy compression has occurred is not stored in the file.[1] Thus the DjView viewing application cannot warn the user that glyph substitutions might have occurred, either when opening a lossily compressed file or in the Information or Metadata dialogue boxes.[21]
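How a tolerant shape match can turn one character into another can be shown with a toy Python example. The pixel-difference threshold used here is an assumption made for illustration; it is not the matching criterion of JB2 or JBIG2, and the names hamming and match_shape are hypothetical.

```python
from typing import List, Optional, Tuple

Bitmap = Tuple[str, ...]


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_shape(bitmap: Bitmap, shapes: List[Bitmap], max_diff: int) -> Optional[int]:
    """Return the index of the first stored shape within max_diff pixels."""
    for i, shape in enumerate(shapes):
        same_size = len(shape) == len(bitmap) and all(
            len(r) == len(q) for r, q in zip(shape, bitmap)
        )
        if same_size and hamming(shape, bitmap) <= max_diff:
            return i
    return None


# Tiny 3x5 digits: they differ in a single pixel.
SIX = ("###", "#..", "###", "#.#", "###")
EIGHT = ("###", "#.#", "###", "#.#", "###")
```

With max_diff set to 0 the digit 8 is not matched against a dictionary containing only the 6 and would be stored as a new shape; with max_diff set to 1 it is matched to the 6, so a decoder would draw a 6 where the original page had an 8.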

Format licensing

DjVu is an open file format with patents.[3] The file format specification is published, as well as source code for the reference library.[3] The original authors distribute an open-source implementation named "DjVuLibre" under the GNU General Public License and a patent grant.[22] The rights to the commercial development of the encoding software have been transferred to different companies over the years, including AT&T Corporation, LizardTech,[23] Celartem[24] and Cuminas.[25] Patents typically have an expiry term of about 20 years.

Celartem acquired LizardTech and Extensis.[26][27][24][28][29]

Support

The selection of downloadable DjVu viewers is wider on Linux distributions than it is on Windows or Mac OS. Additionally, the format is rarely supported by proprietary scanning software.

In 2002, the DjVu file format was chosen by the Internet Archive as a format in which its Million Book Project provides scanned public-domain books online (along with TIFF and PDF).[30] In February 2016, the Internet Archive announced that DjVu would no longer be used for new uploads, among other reasons citing the format's declining use and the difficulty of maintaining their Java applet based viewer for the format.[17]

Wikimedia Commons, a media repository used by Wikipedia among others, conditionally permits PDF and DjVu media files.[31]

Notes

  a. ^ Although usually pronounced as an initialism "D-J-V-U", the file type was intended to have the pronunciation DAY-zhah-VOO (/ˌdeɪʒɑːˈvuː/), after French déjà vu.[2]

References

  1. ^ a b c d e f g h i "Lizardtech DjVu Reference" (PDF). Cuminas.jp. p. 25. Retrieved 7 December 2021.
  2. ^ a b "DjVu.org – the premier menu for djvu resources". djvu.org. Archived from the original on 2017-06-29. Retrieved 2017-07-02.
  3. ^ a b c "What is DjVu – DjVu.org". DjVu.org. Archived from the original on 2019-01-21. Retrieved 2009-03-05.
  4. ^ a b c Léon Bottou; Patrick Haffner; Paul G. Howard; Patrice Simard; Yoshua Bengio; Yann Le Cun (1998). "High Quality Document Image Compression with DjVu" (PDF). Journal of Electronic Imaging. 7 (3): 410–425.
  5. ^ Document Viewer, Sufficiently Secure, 2022-04-04, retrieved 2022-04-09
  6. ^ "ISO 32000-1:2008 – Document management – Portable document format – Part 1: PDF 1.7". Iso.org. 2008-07-01. Retrieved 2010-02-21.
  7. ^ Orion, Egan (2007-12-05). "PDF 1.7 is approved as ISO 32000". The Inquirer. Incisive Media. Archived from the original on December 13, 2007. Retrieved 2007-12-05.
  8. ^ Brewster Kahle (December 16, 2004). "Universal Access to All Knowledge" (Audio; Speech at 1h:31 m:20s). Conversations Network.
  9. ^ "LizardTech To Open Source A DjVu Java Viewer". ECM Connection. 7 December 2004. Retrieved 18 August 2017.
  10. ^ "DjVuLibre: Open Source DjVu library and viewer". djvu.sourceforge.net.
  11. ^ "nzdl:projects - Greenstone". Wiki.greenstone.org. Retrieved 7 December 2021.
  12. ^ Eric Rumsey (2018-09-05). "Google Books vs DjVu in Internet Archive". Blog.libuiowa.edu. Archived from the original on 2018-08-22. Retrieved 2018-08-21.
  13. ^ Eric Rumsey (2018-09-10). "DjVu again". Blog.libuiowa.edu.
  14. ^ Jeff Kaplan (2004-12-09). "New book collection: color scans, djvu, some pdf" (PDF). Blog.archive.org.
  15. ^ Janusz S. Bień (2011-09-12). "Efficient search in hidden text of large DjVu documents". Advanced Language Technologies for Digital Libraries (PDF). Lecture Notes in Computer Science. Vol. 6699. pp. 1–14. doi:10.1007/978-3-642-23160-5_1. ISBN 978-3-642-23159-9. S2CID 3095526.
  16. ^ Eric Rumsey (2010-09-10). "Internet Archive's BookReader Thumbnail View". Blog.libuiowa.edu.
  17. ^ a b Brewster Kahle; Jeff Kaplan (2016-02-26). "DjVu files for new uploads". Archive.org.
  18. ^ Artem Mikheev, Luc Vincent, Mike Hawrylycz & Léon Bottou: Electronic Document Publishing Using DjVu
  19. ^ See the JBIG2 article for more details and references.
  20. ^ "This document caused me a fair bit of consternation transcribing it on a site th... | Hacker News". News.ycombinator.com. Retrieved 7 December 2021.
  21. ^ "DjVuLibre". SourceForge.net. Retrieved 7 December 2021.
  22. ^ "DjVuLibre: Open Source DjVu library and viewer".
  23. ^ Extensis. "Company – About – LizardTech". Lizardtech.com.
  24. ^ a b "Celartem, Inc.: Private Company Information – Bloomberg". Bloomberg.com.
  25. ^ "会社情報 - Cuminas Corporation". Cuminas.jp. Archived from the original on 2018-01-15. Retrieved 2018-01-14.
  26. ^ "Company Overview – Celartem Technology, Inc". Celartem.com. Archived from the original on 27 May 2019. Retrieved 7 December 2021.
  27. ^ "Celartem Technology Announces Merger of US Holdings – Extensis.com". Archived from the original on 2018-01-15. Retrieved 2018-01-14.
  28. ^ "Celartem Technology Inc.: Private Company Information – Bloomberg". Bloomberg.com.
  29. ^ "Celartem Sells Extensis and LizardTech Plugins and XTensions to onOne Software – Big Picture – Wide Format Printing". bigpicture.net. 28 July 2005.
  30. ^ "Image file formats – OLPC". Wiki.laptop.org. Retrieved 2008-09-09.
  31. ^ Wikimedia Commons. Project scope: PDF and DjVu.

This information is adapted from Wikipedia which is publicly available.
