Data archaeology

There are two conceptualisations of data archaeology: the technical definition and the social science definition.

Data archaeology (also data archeology) in the technical sense refers to the art and science of recovering computer data encoded and/or encrypted in now obsolete media or formats. Data archaeology can also refer to recovering information from damaged electronic formats after natural disasters or human error.

It entails the rescue and recovery of old data trapped in outdated, archaic or obsolete storage formats, such as floppy disks, magnetic tape and punch cards, and the transfer of that data to more usable formats.

Data archaeology in the social sciences usually involves an investigation into the source and history of datasets and the construction of these datasets. It involves mapping out the entire lineage of data, its nature and characteristics, its quality and veracity and how these affect the analysis and interpretation of the dataset.

The findings of data archaeology affect the degree to which conclusions drawn from data analysis can be trusted.[1]

The term data archaeology originally appeared in 1993 as part of the Global Oceanographic Data Archaeology and Rescue Project (GODAR). The original impetus for data archaeology came from the need to recover computerised records of climatic conditions stored on old computer tape, which can provide valuable evidence for testing theories of climate change. These approaches allowed the reconstruction of an image of the Arctic that had been captured by the Nimbus 2 satellite on September 23, 1966, in higher resolution than ever seen before from this type of data.[2]

NASA also utilises the services of data archaeologists to recover information stored on 1960s-era vintage computer tape, as exemplified by the Lunar Orbiter Image Recovery Project (LOIRP).[3]

Recovery

There is a distinction between data recovery and data intelligibility. One may be able to recover data but not understand it. For data archaeology to be effective, the data must be intelligible.[4]
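The gap between recovery and intelligibility can be illustrated with a minimal sketch. It assumes a hypothetical record written in EBCDIC (code page 037), an encoding common on older mainframe tapes; the record content and the function name `interpret` are invented for illustration. The same bytes are fully recovered in both cases, but only the correct format assumption makes them readable.

```python
# Hypothetical illustration: identical recovered bytes read under two encodings.
# The record content is invented for this sketch.

def interpret(raw: bytes, encoding: str) -> str:
    """Decode recovered bytes under an assumed character encoding."""
    return raw.decode(encoding, errors="replace")

# Simulate bytes recovered from an old tape written in EBCDIC (code page 037).
recovered = "STATION 042 TEMP -12.5".encode("cp037")

# Recovered, but unintelligible: the wrong encoding is assumed.
wrong_guess = interpret(recovered, "latin-1")

# Recovered and intelligible: the original format is known.
right_guess = interpret(recovered, "cp037")

print(repr(wrong_guess))
print(right_guess)
```

Every byte survives in both readings; what differs is the knowledge of the format, which is exactly what documentation and metadata are meant to preserve.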

A term closely related to data archaeology is data lineage. The first step in performing data archaeology is an investigation into the lineage of the data. Data lineage comprises the history of the data, its source and any alterations or transformations it has undergone. Data lineage can be found in the metadata of a dataset, the paradata of a dataset or any accompanying identifiers (methodological guides etc.). With data archaeology comes methodological transparency, which is the level to which the data user can access the data history. The level of methodological transparency available determines not only how much can be recovered, but also how well the data can be understood. A data lineage investigation covers what instruments were used, what the selection criteria were, the measurement parameters and the sampling frameworks.[1]
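One way to picture a lineage record is as a small data structure that keeps the source, the collection details and an ordered list of transformations. The sketch below is illustrative only: the class names, field names and sample values are assumptions, not a standard schema.

```python
from dataclasses import dataclass, field


@dataclass
class Transformation:
    """One alteration applied to the dataset (hypothetical structure)."""
    description: str
    performed_by: str


@dataclass
class LineageRecord:
    """Source, collection details and transformation history of a dataset."""
    source: str
    instrument: str
    selection_criteria: str
    sampling_frame: str
    transformations: list[Transformation] = field(default_factory=list)

    def add_step(self, description: str, performed_by: str) -> None:
        # Append in order, so the list reads as a chronological history.
        self.transformations.append(Transformation(description, performed_by))

    def history(self) -> list[str]:
        # First line describes the origin; later lines are numbered steps.
        lines = [f"origin: {self.source} ({self.instrument})"]
        for i, step in enumerate(self.transformations, start=1):
            lines.append(f"{i}. {step.description} [{step.performed_by}]")
        return lines


# Invented sample values, for illustration only.
record = LineageRecord(
    source="coastal weather survey",
    instrument="ship-borne thermometer",
    selection_criteria="stations reporting daily",
    sampling_frame="all registered vessels",
)
record.add_step("converted Fahrenheit to Celsius", "archive team")
record.add_step("removed duplicate entries", "data curator")

for line in record.history():
    print(line)
```

The more of these fields that can be filled in from metadata or methodological guides, the greater the methodological transparency available to the data user.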

In the socio-political sense, data archaeology involves the analysis of data assemblages to reveal their discursive and material socio-technical elements and apparatuses. This kind of analysis can reveal the politics of the data being analysed and thus that of the institution that produced them. Archaeology in this sense refers to the provenance of data. It involves mapping the sites, formats and infrastructures through which data flow and are altered or transformed over time. It is concerned with the life of data and the politics that shape their circulation. This serves to expose the key actors, practices and praxes at play and their roles. It can be accomplished in two steps. The first is accessing and assessing the technical stack of the data (the infrastructure and material technologies used to build or gather the data) to understand the physical representation of the data. The second is analysing the contextual stack of the data, which shapes how the data are constructed, used and analysed. This can be done via a variety of processes: interviews, analysing technical and policy documents, and investigating the effect of the data on a community or the institutional, financial, legal and material framing. This can be attained by creating a data assemblage.[1]

Data archaeology charts the way data moves across different sites and can sometimes encounter data friction.[5]

Disaster recovery

Data archaeologists can also use data recovery after natural disasters such as fires, floods, earthquakes, or even hurricanes. For example, in 1995 during Hurricane Marilyn the National Media Lab assisted the National Archives and Records Administration in recovering data at risk due to damaged equipment. The hardware was damaged from rain, salt water, and sand, yet it was possible to clean some of the disks and refit them with new cases thus saving the data within.[4]

Recovery techniques

Data stored in outdated formats like the floppy disk have to be restored to newer formats

When deciding whether to attempt data recovery, the cost must be taken into account. Given enough time and money, most data can be recovered. In the case of magnetic media, which are the most common type used for data storage, there are various techniques that can be used to recover the data depending on the type of damage.[4]: 17

Humidity can cause tapes to become unusable as they begin to deteriorate and become sticky. In this case, a heat treatment can be applied to fix this problem, by causing the oils and residues to either be reabsorbed into the tape or evaporate off the surface of the tape. However, this should only be done in order to provide access to the data so it can be extracted and copied to a medium that is more stable.[4]: 17–18 

Lubrication loss is another source of damage to tapes. This is most commonly caused by heavy use, but can also be a result of improper storage or natural evaporation. As a result of heavy use, some of the lubricant can remain on the read-write heads which then collect dust and particles. This can cause damage to the tape. Loss of lubrication can be addressed by re-lubricating the tapes. This should be done cautiously, as excessive re-lubrication can cause tape slippage, which in turn can lead to media being misread and the loss of data.[4]: 18 

Water exposure will damage tapes over time. This often occurs in a disaster situation. If the media is in salty or dirty water, it should be rinsed in fresh water. The process of cleaning, rinsing, and drying wet tapes should be done at room temperature in order to prevent heat damage. Older tapes should be recovered prior to newer tapes, as they are more susceptible to water damage.[4]: 18 

The next step (after investigating the data lineage) is to establish what counts as good data and bad data, to ensure that only the 'good' data gets migrated to the new data warehouse or repository. A good example of bad data, in the technical sense, is test data.
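Separating good data from bad before migration can be sketched as a simple filter. The rule used here (a record counts as test data if it carries an `is_test` flag or its name begins with "test") is a hypothetical convention chosen for illustration; real criteria depend on the dataset and its lineage.

```python
def is_test_record(record: dict) -> bool:
    """Hypothetical rule: flagged explicitly, or named like a test entry."""
    name = str(record.get("name", "")).lower()
    return bool(record.get("is_test")) or name.startswith("test")


def split_for_migration(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (good, bad): only the 'good' list should be migrated."""
    good, bad = [], []
    for rec in records:
        (bad if is_test_record(rec) else good).append(rec)
    return good, bad


# Invented sample records, for illustration only.
records = [
    {"name": "Station A", "value": 12.5},
    {"name": "TEST entry", "value": 0},
    {"name": "Station B", "value": 9.1, "is_test": True},
    {"name": "Station C", "value": 10.4},
]

good, bad = split_for_migration(records)
print([r["name"] for r in good])
print([r["name"] for r in bad])
```

Keeping the rejected records in a separate list, rather than discarding them, leaves a trace of what was excluded and why, which itself becomes part of the data lineage.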

Prevention

To avoid the need for data archaeology, creators and holders of digital documents should take care to employ digital preservation.

Storing data on an offshore server is a good preventive measure against data loss

Another effective preventive measure is the use of offshore backup facilities that would not be affected should a disaster occur. From these backup servers, copies of the lost data can be retrieved. A multi-site and multi-technique data distribution plan is advised for optimal data recovery, especially when dealing with big data. The TCP/IP method, snapshot recovery, mirror sites and tapes safeguarding data in a private cloud are also good preventive methods, as is transferring data daily from mirror sites to emergency servers.[6]
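A multi-site plan is only useful if the copies agree with one another. As a rough sketch, and as an assumption not drawn from the cited source, the copies held at each site can be compared by checksum so that a damaged mirror is noticed before it is needed. The site names and the majority rule below are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_of(data: bytes) -> str:
    """Hex digest used as a fingerprint of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def find_divergent_copies(copies: dict[str, bytes]) -> list[str]:
    """Return the sites whose copy differs from the most common digest."""
    digests = {site: sha256_of(blob) for site, blob in copies.items()}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(site for site, d in digests.items() if d != majority)


# Invented sites and contents, for illustration only.
copies = {
    "primary": b"climate records 1966",
    "mirror_a": b"climate records 1966",
    "mirror_b": b"climate rec0rds 1966",  # simulated corruption
}

print(find_divergent_copies(copies))
```

A majority vote is a simplification: with only two sites, or with several sites corrupted in the same way, a trusted reference digest stored separately would be needed instead.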

References

  1. ^ a b c Kitchin, Rob (2022). The Data Revolution. Sage.
  2. ^ Techno-archaeology rescues climate data from early satellites Archived 2010-11-26 at the Wayback Machine U.S. National Snow and Ice Data Center (NSIDC), January 2010
  3. ^ LOIRP Overview NASA website November 14, 2008 Archived
  4. ^ a b c d e f [1] Study on website October 23, 2011
  5. ^ Bates, Jo (2016). "Data Journeys: Capturing the socio-material constitution of data objects and flows". Big Data and Society. 3 (2): 1–12. doi:10.1177/2053951716654502. S2CID 54719310.
  6. ^ Chang, V (2015). "Towards a Big Data system disaster recovery in a Private Cloud" (PDF). Ad Hoc Networks. 5: 65–82. doi:10.1016/j.adhoc.2015.07.012. S2CID 18230189 – via Elsevier.

Further reading

  • O'Donnell, James Joseph. Avatars of the Word: From Papyrus to Cyberspace. Harvard University Press, 1998.
  • Ross, Seamus & Gow, Ann (1999). Digital archaeology : rescuing neglected and damaged data resources (PDF). Electronic libraries programme studies. London & Bristol: British Library and Joint Information Systems Committee. ISBN 1-90050-851-6.
  • Dumit, J. and Nafus, D. (2018) ‘The other ninety per cent: Thinking with data science, creating data studies,’ in Knox, H. and Nafus, D. (eds), Ethnography for a Data-Saturated World. Manchester University Press, Manchester, pp. 252–274
