Predictive modelling

Predictive modelling uses statistics to predict outcomes.[1] Most often the event one wants to predict is in the future, but predictive modelling can be applied to any type of unknown event, regardless of when it occurred. For example, predictive models are often used to detect crimes and identify suspects, after the crime has taken place.[2]

In many cases, the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data, for example given an email determining how likely that it is spam.

Models can use one or more classifiers in trying to determine the probability of a set of data belonging to another set. For example, a model might be used to determine whether an email is spam or "ham" (non-spam).

Depending on definitional boundaries, predictive modelling is synonymous with, or largely overlapping with, the field of machine learning, as it is more commonly referred to in academic or research and development contexts. When deployed commercially, predictive modelling is often referred to as predictive analytics.

Predictive modelling is often contrasted with causal modelling/analysis. In the former, one may be entirely satisfied to make use of indicators of, or proxies for, the outcome of interest. In the latter, one seeks to determine true cause-and-effect relationships. This distinction has given rise to a burgeoning literature in the fields of research methods and statistics and to the common statement that "correlation does not imply causation".

Models

Nearly any statistical model can be used for prediction purposes. Broadly speaking, there are two classes of predictive models: parametric and non-parametric. A third class, semi-parametric models, includes features of both. Parametric models make "specific assumptions with regard to one or more of the population parameters that characterize the underlying distribution(s)".[3] Non-parametric models "typically involve fewer assumptions of structure and distributional form [than parametric models] but usually contain strong assumptions about independencies".[4]

Applications

Uplift modelling

Uplift modelling is a technique for modelling the change in probability caused by an action. Typically this is a marketing action such as an offer to buy a product, to use a product more or to re-sign a contract. For example, in a retention campaign you wish to predict the change in probability that a customer will remain a customer if they are contacted. A model of the change in probability allows the retention campaign to be targeted at those customers on whom the change in probability will be beneficial. This allows the retention programme to avoid triggering unnecessary churn or customer attrition without wasting money contacting people who would act anyway.

Archaeology

Predictive modelling in archaeology gets its foundations from Gordon Willey's mid-fifties work in the Virú Valley of Peru.[5] Complete, intensive surveys were performed then covariability between cultural remains and natural features such as slope and vegetation were determined. Development of quantitative methods and a greater availability of applicable data led to growth of the discipline in the 1960s and by the late 1980s, substantial progress had been made by major land managers worldwide.

Generally, predictive modelling in archaeology is establishing statistically valid causal or covariable relationships between natural proxies such as soil types, elevation, slope, vegetation, proximity to water, geology, geomorphology, etc., and the presence of archaeological features. Through analysis of these quantifiable attributes from land that has undergone archaeological survey, sometimes the "archaeological sensitivity" of unsurveyed areas can be anticipated based on the natural proxies in those areas. Large land managers in the United States, such as the Bureau of Land Management (BLM), the Department of Defense (DOD),[6][7] and numerous highway and parks agencies, have successfully employed this strategy. By using predictive modelling in their cultural resource management plans, they are capable of making more informed decisions when planning for activities that have the potential to require ground disturbance and subsequently affect archaeological sites.

Customer relationship management

Predictive modelling is used extensively in analytical customer relationship management and data mining to produce customer-level models that describe the likelihood that a customer will take a particular action. The actions are usually sales, marketing and customer retention related.

For example, a large consumer organization such as a mobile telecommunications operator will have a set of predictive models for product cross-sell, product deep-sell (or upselling) and churn. It is also now more common for such an organization to have a model of savability using an uplift model. This predicts the likelihood that a customer can be saved at the end of a contract period (the change in churn probability) as opposed to the standard churn prediction model.

Auto insurance

Predictive modelling is utilised in vehicle insurance to assign risk of incidents to policy holders from information obtained from policy holders. This is extensively employed in usage-based insurance solutions where predictive models utilise telemetry-based data to build a model of predictive risk for claim likelihood.[citation needed] Black-box auto insurance predictive models utilise GPS or accelerometer sensor input only.[citation needed] Some models include a wide range of predictive input beyond basic telemetry including advanced driving behaviour, independent crash records, road history, and user profiles to provide improved risk models.[citation needed]

Health care

In 2009 Parkland Health & Hospital System began analyzing electronic medical records in order to use predictive modeling to help identify patients at high risk of readmission. Initially, the hospital focused on patients with congestive heart failure, but the program has expanded to include patients with diabetes, acute myocardial infarction, and pneumonia.[8]

In 2018, Banerjee et al.[9] proposed a deep learning model for estimating short-term life expectancy (>3 months) of the patients by analyzing free-text clinical notes in the electronic medical record, while maintaining the temporal visit sequence. The model was trained on a large dataset (10,293 patients) and validated on a separated dataset (1818 patients). It achieved an area under the ROC (Receiver Operating Characteristic) curve of 0.89. To provide explain-ability, they developed an interactive graphical tool that may improve physician understanding of the basis for the model's predictions. The high accuracy and explain-ability of the PPES-Met model may enable the model to be used as a decision support tool to personalize metastatic cancer treatment and provide valuable assistance to physicians.

The first clinical prediction model reporting guidelines were published in 2015 (Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD)), and have since been updated.[10]

Predictive modelling has been used to estimate surgery duration.

Algorithmic trading

Predictive modeling in trading is a modeling process wherein the probability of an outcome is predicted using a set of predictor variables. Predictive models can be built for different assets like stocks, futures, currencies, commodities etc.[citation needed] Predictive modeling is still extensively used by trading firms to devise strategies and trade. It utilizes mathematically advanced software to evaluate indicators on price, volume, open interest and other historical data, to discover repeatable patterns.[11]

Lead tracking systems

Predictive modelling gives lead generators a head start by forecasting data-driven outcomes for each potential campaign. This method saves time and exposes potential blind spots to help client make smarter decisions.[12]

Notable failures of predictive modeling

Although not widely discussed by the mainstream predictive modeling community, predictive modeling is a methodology that has been widely used in the financial industry in the past and some of the major failures contributed to the financial crisis of 2007–2008. These failures exemplify the danger of relying exclusively on models that are essentially backward looking in nature. The following examples are by no mean a complete list:

  1. Bond rating. S&P, Moody's and Fitch quantify the probability of default of bonds with discrete variables called rating. The rating can take on discrete values from AAA down to D. The rating is a predictor of the risk of default based on a variety of variables associated with the borrower and historical macroeconomic data. The rating agencies failed with their ratings on the US$600 billion mortgage backed Collateralized Debt Obligation (CDO) market. Almost the entire AAA sector (and the super-AAA sector, a new rating the rating agencies provided to represent super safe investment) of the CDO market defaulted or severely downgraded during 2008, many of which obtained their ratings less than just a year previously.[citation needed]
  2. So far, no statistical models that attempt to predict equity market prices based on historical data are considered to consistently make correct predictions over the long term. One particularly memorable failure is that of Long Term Capital Management, a fund that hired highly qualified analysts, including a Nobel Memorial Prize in Economic Sciences winner, to develop a sophisticated statistical model that predicted the price spreads between different securities. The models produced impressive profits until a major debacle that caused the then Federal Reserve chairman Alan Greenspan to step in to broker a rescue plan by the Wall Street broker dealers in order to prevent a meltdown of the bond market.[citation needed]

Possible fundamental limitations of predictive models based on data fitting

History cannot always accurately predict the future. Using relations derived from historical data to predict the future implicitly assumes there are certain lasting conditions or constants in a complex system. This almost always leads to some imprecision when the system involves people.[citation needed]

Unknown unknowns are an issue. In all data collection, the collector first defines the set of variables for which data is collected. However, no matter how extensive the collector considers his/her selection of the variables, there is always the possibility of new variables that have not been considered or even defined, yet are critical to the outcome.[citation needed]

Algorithms can be defeated adversarially. After an algorithm becomes an accepted standard of measurement, it can be taken advantage of by people who understand the algorithm and have the incentive to fool or manipulate the outcome. This is what happened to the CDO rating described above. The CDO dealers actively fulfilled the rating agencies' input to reach an AAA or super-AAA on the CDO they were issuing, by cleverly manipulating variables that were "unknown" to the rating agencies' "sophisticated" models.[citation needed]

See also

References

  1. ^ Geisser, Seymour (1993). Predictive Inference: An Introduction. Chapman & Hall. p. [page needed]. ISBN 978-0-412-03471-8.
  2. ^ Finlay, Steven (2014). Predictive Analytics, Data Mining and Big Data. Myths, Misconceptions and Methods (1st ed.). Palgrave Macmillan. p. 237. ISBN 978-1137379276.
  3. ^ Sheskin, David J. (April 27, 2011). Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press. p. 109. ISBN 978-1439858011.
  4. ^ Cox, D. R. (2006). Principles of Statistical Inference. Cambridge University Press. p. 2.
  5. ^ Willey, Gordon R. (1953), "Prehistoric Settlement Patterns in the Virú Valley, Peru", Bulletin 155. Bureau of American Ethnology
  6. ^ Heidelberg, Kurt, et al. "An Evaluation of the Archaeological Sample Survey Program at the Nevada Test and Training Range", SRI Technical Report 02-16, 2002
  7. ^ Jeffrey H. Altschul, Lynne Sebastian, and Kurt Heidelberg, "Predictive Modeling in the Military: Similar Goals, Divergent Paths", Preservation Research Series 1, SRI Foundation, 2004
  8. ^ "Hospital Uses Data Analytics and Predictive Modeling To Identify and Allocate Scarce Resources to High-Risk Patients, Leading to Fewer Readmissions". Agency for Healthcare Research and Quality. 2014-01-29. Retrieved 2019-03-19.
  9. ^ Banerjee, Imon; et al. (2018-07-03). "Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met) Utilizing Free-Text Clinical Narratives". Scientific Reports. 8 (10037 (2018)): 10037. Bibcode:2018NatSR...810037B. doi:10.1038/s41598-018-27946-5. PMC 6030075. PMID 29968730.
  10. ^ Collins, Gary; et al. (2024-04-16). "TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods". BMJ. doi:10.1136/bmj-2023-078378. PMC 11019967. PMID 38626948.
  11. ^ "Predictive-Model Based Trading Systems, Part 1 - System Trader Success". System Trader Success. 2013-07-22. Retrieved 2016-11-25.
  12. ^ "Predictive Modeling for Call Tracking". Phonexa. 2019-08-22. Retrieved 2021-02-25.

Further reading

Read other articles:

捍衛雅各可以指: 捍衛雅各 (小說),由威廉·藍迪所撰寫的小說 捍衛雅各 (迷你劇),Apple TV+所製作的影集 这是一个消歧义页,羅列了有相同或相近的标题,但內容不同的条目。如果您是通过某條目的内部链接而转到本页,希望您能協助修正该處的内部链接,將它指向正确的条目。

San Juanito MunicipioBanderaEscudo San JuanitoLocalización de San Juanito en Colombia San JuanitoLocalización de San Juanito en MetaCoordenadas 4°27′31″N 73°40′32″O / 4.4586111111111, -73.675555555556Entidad Municipio • País Colombia • Departamento MetaAlcalde Cesar Augusto Aya Rozo (2024-2027)Eventos históricos   • Fundación 17 de noviembre de 1913[1]​ • Erección 17 de noviembre de 1966[1]​Superficie   •...

Japanese manga series Shaman King: The Super StarCover of the first volumeGenreAdventure[1]Supernatural[1] MangaWritten byHiroyuki TakeiPublished byKodanshaEnglish publisherNA: Kodansha USA (digital)ImprintMagazine Edge KCMagazineShōnen Magazine Edge (2018–2023)Magazine Pocket (TBA)DemographicShōnenOriginal runMay 17, 2018 – presentVolumes7 Shaman King: The Super Star is a Japanese manga series written and illustrated by Hiroyuki Takei. It is a spin-off of Take...

Logotipo do PT Esta é uma lista de marcos eleitorais do Partido dos Trabalhadores (PT), ou seja, uma lista das primeiras vezes em que o partido logrou algum êxito notável em uma eleição. A lista inclui também marcos que o partido deixou na história do Brasil, e não apenas na do partido. Lista       Eleições gerais, com uma vaga por estado para o Senado Federal       Eleições gerais, com duas vagas por estado para o Senado...

Ахмеров — термін, який має кілька значень. Ця сторінка значень містить посилання на статті про кожне з них.Якщо ви потрапили сюди за внутрішнім посиланням, будь ласка, поверніться та виправте його так, щоб воно вказувало безпосередньо на потрібну статтю.@ пошук посилань са

Artikel ini tidak memiliki referensi atau sumber tepercaya sehingga isinya tidak bisa dipastikan. Tolong bantu perbaiki artikel ini dengan menambahkan referensi yang layak. Tulisan tanpa sumber dapat dipertanyakan dan dihapus sewaktu-waktu.Cari sumber: Pokok kaidah fundamental negara – berita · surat kabar · buku · cendekiawan · JSTOR Pokok kaidah fundamental negara (Staatsfundamentalnorm dalam bahasa Jerman) adalah kedudukan sebagai kaidah negara yang...

Coordenadas: 40° 03' N 1° 0' E  Nota: Este artigo é sobre uma comuna italiana. Para outros significados, veja Gallipoli (desambiguação). Galípoli    Comuna   Localização Ficheiro:PUG-Mappa.png GalípoliLocalização de Galípoli na Itália Coordenadas 40° 03' 20 N 17° 59' 30 E Região Apúlia Província Lecce Características geográficas Área total 40 km² População total 20 513 hab. Densidade 513 hab./km² Altitude 12&...

Papua New Guinea Institute of Medical ResearchSignboard at IMR GorokaEstablished1968 The Papua New Guinea Institute of Medical Research (PNG IMR) is the principal institution conducting health research in Papua New Guinea with a focus on health problems affecting the country's population. History The Papua New Guinea Institute of Medical Research is a statutory body of the Government of Papua New Guinea (PNG). The Institute was established in 1968 through an Act of Parliament and it effective...

село Лизогубова Слобода Країна  Україна Область Київська область Район Броварський район Громада Згурівська селищна громада Основні дані Засноване 1720 Населення 426 Площа 3,514 км² Густота населення 177,29 осіб/км² Поштовий індекс 07650 Телефонний код +380 4570 Географічні �...

Pakistan Army branch charged with the supply of weapons and ammunition Pakistan Army Corps of OrdnanceBadge of the Pakistan Army Ordnance CorpsActive1947; 76 years ago (1947)Country PakistanBranch Pakistan ArmyTypeCombat service supportRoleAdministrative and staffing oversight.SizeVariesHQ/GarrisonOrdnance Center in Malir Cantonment, Sindh, PakistanNickname(s)ORDAnniversaries1947EngagementsMilitary history of PakistanDecorationsMilitary Decorations of Pakistan mili...

この項目では、物理的な現象と、その応用について説明しています。その他の用法については「共鳴 (曖昧さ回避)」をご覧ください。 「レゾナンス」はこの項目へ転送されています。その他の用法については「レゾナンス (曖昧さ回避)」をご覧ください。 この記事は検証可能な参考文献や出典が全く示されていないか、不十分です。出典を追加して記事の信頼性向上に...

WWII prisoners of war The mother of a prisoner thanks Chancellor Konrad Adenauer upon his return from Moscow on September 14, 1955. Adenauer had succeeded in concluding negotiations for the release to Germany by the end of the year of 15,000 German civilians and prisoners of war Prisoners returning in 1955 Approximately three million German prisoners of war were captured by the Soviet Union during World War II, most of them during the great advances of the Red Army in the last year of the war...

Dreams of Distant Shores Cover of first editionAuthorPatricia A. McKillipCover artistThomas CantyCountryUnited StatesLanguageEnglishGenreFantasyPublisherTachyon PublicationsPublication date2016Media typePrint (trade paperback), ebookPages274ISBN978-1-61696-218-0OCLC920018313 Dreams of Distant Shores is a collection of fantasy stories by Patricia A. McKillip. It was first published in ebook by Tachyon Publications in May 2016, with the trade paperback print edition following from the...

City in Alabama, United States Opelika redirects here. For the unincorporated community in Texas, see Opelika, Texas. City in Alabama, United StatesOpelika, AlabamaCityDowntown OpelikaMotto: Rich in Heritage With a Vision for the FutureLocation of Opelika in Lee County, AlabamaCoordinates: 32°38′43″N 85°22′42″W / 32.64528°N 85.37833°W / 32.64528; -85.37833[1]CountryUnited StatesStateAlabamaCountyLeeGovernment • MayorGary Fuller (R)A...

Karl von Müffling Karl von Müffling, voor 1837 Bijnaam Weiß Geboren 12 juni 1775Halle Overleden 16 januari 1851Erfurt Land/zijde Pruisen Onderdeel Pruisische Leger Dienstjaren 1787 - 1847 Rang Generalfeldmarschall Eenheid Fuselier Bataljon von Schenck[1] Slagen/oorlogen Eerste Coalitieoorlog Slag bij Jena Slag bij Lübeck Onderscheidingen Zie onderscheidingen Ander werk Gouverneur van Parijs (1814)[2][1] Militair cartograaf van de Rijnprovincie (1816-1828)Gouve...

Series of dance music compilations This article relies largely or entirely on a single source. Relevant discussion may be found on the talk page. Please help improve this article by introducing citations to additional sources.Find sources: Euphoria compilations – news · newspapers · books · scholar · JSTOR (October 2015) The first compilation in the Euphoria series: 'For the Mind, Body and Soul' Euphoria, mixed by PF Project. Euphoria is a series ...

Railway stop in Hertfordshire, England Watford Stadium HaltWatford Stadium Halt in 2014.Watford Stadium HaltLocation of Watford Stadium Halt in HertfordshireLocationWatfordLocal authorityBorough of WatfordGrid referenceTQ102954Number of platforms1Railway companiesOriginal companyBritish RailKey dates4 December 1982 (1982-12-04)Station Opened?Last train called22 March 1996Services ceased25 March 1996Substitute bus service commenced29 September 2003Official closure and withdrawal...

William John Coffee (1774–1846) adalah seniman dan pematung Inggris berkelas internasional yang membuat patung dari porselen, plester, dan terracotta. Ia juga bekerja di pabrik minyak, meskipun demikian ini bukanlah tempat ia menjadi terkenal. Karier awalnya adalah membuat model untuk pabrik porselen Duesburry di Nottingham Road, Derby, Inggris. Akhir masa kehidupannya dihabiskan di Amerika.[1] Biografi Patung dada Erasmus Darwin, dibuat oleh Coffee tahun 1795 Selama tinggal di Derb...

VSHORAD Skyranger 30 A Skyranger 30 on a Boxer chassisTypeVSHORADProduction historyManufacturerRheinmetall Air Defence AGSpecificationsMassturret only: 2–2.5 tLengthturret only: 5.175 mmWidthturret only: 2.568 mmHeightturret only: 1.444 mmCrew3Caliber30×173 mmMaximum firing rangegun: 3000 mmissile: 5–8 kmArmorSTANAG 4569 Level 2, upgradable to Level 4 The Skyranger 30 is a short range air defense turret system developed Rheinmetall Air Defence AG (formerly Oerlikon) and first r...

Japanese manga series Good Night WorldFirst tankōbon volume coverグッド・ナイト・ワールド(Guddo Naito Wārudo) MangaWritten byUru Okabe [ja]Published byShogakukanImprintUra Sunday ComicsMagazineUra SundayMangaONEDemographicShōnenOriginal runDecember 28, 2015 – January 8, 2017Volumes5 MangaGood Night World EndWritten byUru OkabePublished byShogakukanMagazineMangaONEDemographicShōnenOriginal runAugust 1, 2023 – present Original net animationDi...