
Stata

Original author(s): William Gould[1]
Developer(s): StataCorp
Initial release: 1985
Stable release: 18.0 / April 25, 2023
Written in: C
Operating system: Windows, macOS, Linux
Type: Statistical analysis, numerical analysis
License: Proprietary
Website: www.stata.com

Stata (/ˈsteɪtə/,[2] STAY-ta, alternatively /ˈstætə/, occasionally stylized as STATA[3][4]) is a general-purpose statistical software package developed by StataCorp for data manipulation, visualization, statistics, and automated reporting. It is used by researchers in many fields, including biomedicine, economics, epidemiology, and sociology.[5]

Stata was initially developed by Computing Resource Center in California and the first version was released in 1985.[6] In 1993, the company moved to College Station, Texas and was renamed Stata Corporation, now known as StataCorp.[1] A major release in 2003 included a new graphics system and dialog boxes for all commands.[6] Since then, a new version has been released once every two years.[7] The current version is Stata 18, released in April 2023.[8]

Technical overview and terminology

User interface

Stata has always provided an integrated command-line interface. Since version 8.0, it has also included a graphical user interface that uses menus and dialog boxes to give access to many built-in commands. The dataset can be viewed or edited in spreadsheet format. From version 11 onward, other commands can be executed while the data browser or editor is open.

Data structure and storage

Until the release of version 16,[9] Stata could only open a single dataset at any one time. Stata allows for flexibility with assigning data types to data. Its compress command automatically reassigns data to data types that take up less memory without loss of information. Stata utilizes integer storage types which occupy only one or two bytes rather than four, and single-precision (4 bytes) rather than double-precision (8 bytes) is the default for floating-point numbers.

Stata's data are always tabular. Stata refers to the columns of tabular data as variables and to the rows as observations.
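
As a minimal sketch of the storage behaviour described above, the following commands use the auto dataset that ships with Stata. The names mpg_dbl, x, and other are made up for illustration, and the frames lines assume Stata 16 or later.

```stata
sysuse auto, clear               // Load the example dataset shipped with Stata

generate double mpg_dbl = mpg    // Hypothetical copy of mpg stored as an 8-byte double
describe mpg mpg_dbl             // Compare the storage types of the two variables
compress mpg_dbl                 // Demote mpg_dbl to a smaller type without losing information
describe mpg mpg_dbl             // Check the storage type again

generate x = 0.1                 // No type given, so x is stored as a 4-byte float
count if x == 0.1                // The literal 0.1 is held in double precision and differs from float x
count if x == float(0.1)         // Rounding the literal to float precision makes the comparison match

frame create other               // Stata 16+: create a second, empty frame named "other"
frame other: sysuse census       // Load a different dataset into that frame
frames dir                       // List the frames currently in memory
```

The float comparison is the usual reason to wrap literals in float() or to declare a variable as double when exact decimal comparisons matter.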

Data format compatibility

Stata can import data in a variety of formats. This includes ASCII data formats (such as CSV or databank formats) and spreadsheet formats (including various Excel formats).
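
A short round trip illustrates the import commands. This is only a sketch: the file names auto_demo.csv and auto_demo.xlsx are placeholders written to the current working directory, and the commands assume a reasonably recent Stata release.

```stata
sysuse auto, clear
export delimited using "auto_demo.csv", replace                   // Write a CSV file
import delimited using "auto_demo.csv", clear                     // Read it back; the header row is normally used for variable names
describe

sysuse auto, clear
export excel using "auto_demo.xlsx", firstrow(variables) replace  // Write an Excel workbook with a header row
import excel using "auto_demo.xlsx", firstrow clear               // Read it back, taking variable names from the first row
describe
```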

Stata's proprietary file formats have changed over time, although not every Stata release includes a new dataset format. Every version of Stata can read all older dataset formats, and can write both the current and most recent previous dataset format, using the saveold command.[10] Thus, the current Stata release can always open datasets that were created with older versions, but older versions cannot read newer format datasets.
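
The difference between save and saveold can be sketched as follows. The file names are placeholders, and the version() option is an assumption about the running release: it exists in newer versions of Stata, and the range of values it accepts varies by release.

```stata
sysuse auto, clear
save "auto_current.dta", replace               // Dataset format of the running release
saveold "auto_old.dta", version(13) replace    // Older format, readable by earlier releases
use "auto_old.dta", clear                      // The current release reads the older format
```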

Stata can read and write SAS XPORT format datasets natively, using the fdause and fdasave commands.

Some other econometric applications, including gretl, can directly import Stata file formats.

History

Origins

Development of Stata began in 1984, led initially by William (Bill) Gould, who was later joined by Sean Becketti. The software was originally intended to compete with statistical programs for personal computers such as SYSTAT and MicroTSP.[6] Stata was written, then as now, in the C programming language, initially for PCs running the DOS operating system. The first version was released in 1985 with 44 commands.[6]

Commands in Stata 1.0 and Stata 1.1
append dir infile plot spool
beep do input query summarize
by drop label regress tabulate
capture erase list rename test
confirm exit macro replace type
convert expand merge run use
correlate format modify save
count generate more set
describe help outfile sort

Development

There were 17 major releases of Stata between 1985 and 2021, with additional code and documentation updates between major releases.[7] In its early years, extra sets of Stata programs were sometimes sold as "kits" or distributed as Support Disks. With the release of Stata 6 in 1999, updates began to be delivered to users via the web.[6] The initial release of Stata was for the DOS operating system. Since then, versions of Stata have been released for Windows, macOS, and Unix variants such as Linux distributions.[6] All Stata files are platform-independent.

Hundreds of commands have been added to Stata in its 37-year history.[11][12] Certain developments have proved to be particularly important and continue to shape the user experience today, including extensibility, platform independence, and the active user community.[6]

Extensibility

The program command was implemented in Stata 1.2, giving users the ability to add their own commands.[6][13] Support for ado-files followed in Stata 2.1, allowing a user-written program to be automatically loaded into memory. Many user-written ado-files are submitted to the Statistical Software Components Archive hosted by Boston College. StataCorp added an ssc command so that community-contributed programs can be installed directly from within Stata.[14] More recent editions of Stata allow users to call Python code from Stata commands, and allow Python environments such as Jupyter Notebooks to run Stata commands.[15] Although Stata does not support R natively, there are user-written extensions for running R scripts from Stata.[16]
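
A minimal sketch of these extension mechanisms follows. The command name hello is made up for illustration, the ssc lines need internet access, and the python line assumes Stata 16 or later with a Python installation that Stata can find.

```stata
capture program drop hello       // Remove any earlier definition so the block can be rerun
program define hello             // Define a new command called hello
    args name                    // Store the first argument in the local macro `name'
    display "Hello, `name'!"
end

hello Stata                      // Run the new command

ssc describe estout              // Show the SSC archive entry for the estout package
ssc install estout, replace      // Download and install it from SSC

python: print("Hello from Python")   // Execute a single Python statement from Stata
```

Saving the program definition in a file named hello.ado somewhere on Stata's ado-path is what lets the command be loaded automatically the first time it is called.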

User community

A number of important developments were initiated by Stata's active user community.[6] The Stata Technical Bulletin, which often contained user-created commands, was introduced in 1991 and issued six times a year. It was relaunched in 2001 as the peer-reviewed Stata Journal, a quarterly publication containing descriptions of community-contributed commands and tips for the effective use of Stata. In 1994, a listserv began as a hub for users to collaboratively solve coding and technical issues; in 2014, it was converted into a web forum. In 1995, StataCorp began organizing user and developer conferences that meet annually. User group meetings are held annually in the United States (the Stata Conference), the UK, Germany, and Italy, and less frequently in several other countries. Only the Stata Conference in the United States is hosted by StataCorp; local Stata distributors host user group meetings in their own countries.

Table: Releases and Development of Stata
Version Release date Select new or enhanced features
1.0 January 1985
  • Initial release
  • Forty-four commands
1.1 February 1985
  • Bug fixes
1.2 May 1985
  • New menu system
  • Better online help
  • keep
1.3 August 1985
  • Stata/Graphics
  • program
1.4 August 1986
  • New documentation
  • Formatted infile
1.5 February 1987
  • anova
  • logit, probit
2.0 June 1988
  • New graphics
  • String variables
  • Survival analysis: Cox and Kaplan-Meier
  • Stepwise regression
2.1 September 1990
  • Byte variables
  • Factor analysis
  • ado-files
  • reshape
3.0 March 1992
  • logistic, ologit, oprobit, clogit, mlogit
  • tobit, cnreg, rreg, qreg, weibull, ereg
  • epitab
  • pweights
3.1 August 1993
  • mvreg, sureg, heckman, nlreg, areg, canon
  • nbreg
  • constrained linear regression
  • ml
  • codebook
4.0 January 1995
  • xtreg
  • glm
5.0 October 1996
  • xtgee, xtprobit
  • prais, newey, intreg
  • survey estimation commands
  • fracpoly
  • st extended
6.0 January 1999
  • web aware
  • new ml
  • time-series operators
  • arima, arch
  • st rewritten
7.0 December 2000
  • frailty
  • xtabond
  • cluster analysis
  • nlogit
  • roc
  • SMCL
8.0 January 2003
  • graphics
  • extended GUI, dialog boxes available for all commands
  • manova
  • more survey
  • more time series (VARs, SVARs)
  • more GLLAMM internalization
8.1 July 2003
  • updated ml
8.2 October 2003
  • graphics changes
9.0 April 2005
  • Mata matrix programming language
  • survey features
  • linear mixed models
  • multinomial probit models
9.1 September 2005
9.2 April 2006
10.0 June 2007
  • graph editor
  • logistic and Poisson models with complex, nested error components
10.1 August 2008
11.0 July 2009
  • factor variables
  • margins postestimation command
  • multiple imputation
11.1 June 2010
11.2 March 2011
12.0 July 2011
  • automatic memory management
  • structural equation modeling
12.1 January 2012
13.0 June 2013
  • long strings
  • treatment effects
13.1 October 2013
14.0 April 2015
  • Unicode support
  • Bayesian statistical analysis
14.1 October 2015
14.2 September 2016
15.0 June 2017
  • latent class analysis
  • PDF and Word documents
  • color transparency or opacity in graphs
15.1 November 2017
16.0 June 2019
  • frames (multiple datasets in memory)
  • lasso regression
  • automated reporting
  • updated choice models
16.1 February 2020
17.0 April 2021
  • updated tables command
  • Bayesian econometrics
18.0 April 2023
  • Bayesian model averaging
  • causal mediation analysis
  • heterogeneous difference-in-differences

Software products

There are four builds of Stata: Stata/MP, Stata/SE, Stata/BE, and Numerics by Stata.[17] Stata/MP has built-in parallel processing for certain commands, whereas Stata/SE and Stata/BE are limited to a single core.[18] Running on four CPU cores, Stata/MP executes certain commands about 2.4 times faster than the SE or BE versions, roughly 60% of the theoretical maximum speedup.[18] Numerics by Stata allows for web integration of Stata commands.

The builds also differ in the size of dataset they can hold. Stata/MP can store 10 to 20 billion observations and up to 120,000 variables, whereas Stata/SE and Stata/BE store up to 2.14 billion observations and handle up to 32,767 and 2,048 variables, respectively. The maximum number of independent variables in a model is 65,532 in Stata/MP, 10,998 in Stata/SE, and 798 in Stata/BE.[17]

The pricing and licensing of Stata depend on its intended use: business, government/nonprofit, education, or student. Single-user licenses are either renewable annually or perpetual. Other license types include a single license for use by concurrent users, a site license, a volume single-user license with bulk pricing, and a student lab license.[19]

Example code

The following set of commands covers simple data management.[20]

sysuse auto                 // Open the included auto dataset
browse                      // Browse the dataset (opens the Data Editor window)

describe                    // Describes the dataset and associated variables
summarize                   // Summary information about numerical variables

codebook make foreign       // Summary information about the make (string) and foreign (numeric) variables

browse if missing(rep78)    // Browse only observations with missing data for variable rep78
list make if missing(rep78) // List makes of the cars with missing data for variable rep78

The next set of commands moves on to descriptive statistics.

summarize price, detail          // Detailed summary statistics for variable price

tabulate foreign                 // One-way frequency table for variable foreign
tabulate rep78 foreign, row      // Two-way frequency table for variables rep78 and foreign

summarize mpg if foreign == 1    // Summary information about mpg if the car is foreign (the "==" sign tests for equality)
by foreign, sort: summarize mpg  // Summary of mpg for each value of foreign, using the "by" prefix
tabulate foreign, summarize(mpg) // Means of mpg by foreign, using the tabulate command

A simple hypothesis test:

ttest mpg, by(foreign) // T-test for difference in means for domestic vs. foreign cars

Graphing data:

twoway (scatter mpg weight)                     // Scatter plot showing relationship between mpg and weight
twoway (scatter mpg weight), by(foreign, total) // Three graphs for domestic, foreign, and all cars

Linear regression:

generate wtsq = weight^2                      // Create a new variable for weight squared
regress mpg weight wtsq foreign, vce(robust)  // Linear regression of mpg on weight, wtsq, and foreign
predict mpghat                                // Create a new variable containing the predicted values of mpg
twoway (scatter mpg weight) (line mpghat weight, sort), by(foreign) // Graph data and fitted line

Figure: Regression graphs from the auto dataset in Stata 17

References

  1. Newton, H. Joseph (2005). "A conversation with William Gould". The Stata Journal. 5 (1): 19–31. doi:10.1177/1536867X0500500103. S2CID 118322998.
  2. Cox, Nicholas J. "Statalist FAQ". Statalist: The Stata Forum. Retrieved 24 April 2021.
  3. "STATA Data Manipulation: Basics and Applications 7" (PDF). Iuj.ac.jp. Retrieved 27 January 2022.
  4. Suárez, Erick; Pérez, Cynthia; Nogueras, Graciela; Moreno-Gorrín, Camille (2016). Biostatistics in Public Health Using STATA.
  5. "Disciplines". Stata: Software for Statistics and Data Science. Retrieved 21 April 2021.
  6. Cox, Nicholas J. (2005). "A brief history of Stata on its 20th anniversary". The Stata Journal. 5 (1): 2–18. doi:10.1177/1536867X0500500102. S2CID 118366843. Retrieved 22 April 2021.
  7. Gould, William W.; Cox, Nicholas J. "When was Stata first released? When were later versions released?". Stata: Software for Statistics and Data Science. Retrieved 22 April 2021.
  8. "What's new in Stata?". Stata: Software for Statistics and Data Science. StataCorp. Retrieved 25 April 2023.
  9. "Data frames: multiple datasets in memory". Stata.com. Retrieved 13 August 2020.
  10. "Stata 16 help for save". Stata.com.
  11. Stata Glossary and Index: Release 17 (PDF). College Station, TX: Stata Press. pp. 1–50. ISBN 1-59718-283-4.
  12. "Stata features". Stata: Software for Statistics and Data Science. StataCorp. Retrieved 24 April 2021.
  13. "program - Define and manipulate programs" (PDF). Stata: Software for Statistics and Data Science. Stata Press. Retrieved 24 April 2021.
  14. "ssc - Install and uninstall packages from SSC" (PDF). Stata: Software for Statistics and Data Science. Stata Press. Retrieved 24 April 2021.
  15. "Use Python and Stata together | Stata".
  16. "How to Switch Your Workflow from Stata to R, One Bit at a Time · Frederick Solt". Fsolt.org. Retrieved 27 January 2022.
  17. "Which Stata is right for me?". Stata: Software for Statistics and Data Science. Retrieved 23 April 2021.
  18. "Parallel Stata". Harvard Business School.
  19. "Order Stata software". Stata: Software for Statistics and Data Science. StataCorp. Retrieved 25 April 2021.
  20. Getting Started with Stata for Windows (PDF) (Release 17 ed.). College Station, TX: Stata Press. pp. 1–19. ISBN 1-59718-334-2. Retrieved 25 April 2021.
