Ideographic Research Group

The Ideographic Research Group (IRG), formerly called the Ideographic Rapporteur Group, is a subgroup of Working Group 2 (WG2) of ISO/IEC JTC 1 Subcommittee 2 (SC2), the committee responsible for developing the Universal Coded Character Set (ISO/IEC 10646). The IRG is tasked with preparing and reviewing sets of CJK unified ideographs for eventual inclusion in both ISO/IEC 10646 and the Unicode Standard.[1][2] It is composed of representatives of national standards bodies from China, Japan, South Korea, Vietnam, and other regions that have historically used Chinese characters, as well as experts from liaison organizations such as the SAT Daizōkyō Text Database Committee (SAT), the Taipei Computer Association (TCA), and the Unicode Technical Committee (UTC). The group meets twice a year for four to five days at a time and then reports its activities to its parent committee, ISO/IEC JTC 1/SC 2/WG 2 (SC2/WG2).

History

The precursor to the IRG was the CJK Joint Research Group (CJK-JRG), established in 1990. In May 1993, this group was re-established as the Ideographic Rapporteur Group (IRG), a subgroup of WG2.[3][4] In June 2019, the subgroup acquired its current name.[2]

The first IRG rapporteur was Kato Shigenobu (加藤重信), from 1993 to 1994, followed by Kido Akio (木戸彰夫) from 1994 to 1995.[4] From 1995 to 2004, the IRG rapporteur was Zhang Zhoucai (张轴材), who had been convenor and chief editor of CJK-JRG from 1990 to 1993. From 2004 to 2018, the IRG rapporteur was Hong Kong Polytechnic University professor Lu Qin (陸勤).[1][5] In June 2018, the title of "rapporteur" was changed to "convenor", and Lu Qin continued as IRG convenor for another six years.[6] Since June 2024, the IRG convenor has been Ken Lunde.[7]

Overview

The IRG is responsible for reviewing proposals to add new CJK unified ideographs to the Universal Coded Character Set (ISO/IEC 10646), and equivalently to the Unicode Standard. It submits consolidated proposals for sets of unified ideographs to WG2, which are then processed for encoding in the respective standards by SC2 and the Unicode Technical Committee.[8][9] National and liaison bodies that have been represented in the IRG include China, Hong Kong, Macau, Japan (no longer active), North Korea (no longer active), South Korea, Singapore (no longer active), the Taipei Computer Association (TCA), the United Kingdom, Vietnam, and the Unicode Technical Committee (UTC).

As of Unicode version 16.0, the IRG has been responsible for submitting multiple blocks of CJK unified and compatibility ideographs for encoding.[10]

Since 2015, proposed characters submitted by IRG member bodies have been processed in batches called "IRG Working Sets". Each working set undergoes several years of review by IRG experts before it is officially submitted to WG2 as a new block. Once accepted by WG2, the proposed block is processed according to the individual procedures followed by ISO/IEC JTC 1/SC 2 and the Unicode Technical Committee (UTC). In the case of SC2, this involves balloting of ISO member bodies.[11] The following working sets have been processed by the IRG:

WS2015. 5,547 submitted characters which resulted in 4,939 characters encoded in CJK Unified Ideographs Extension G (Unicode version 13.0, March 2020):

  • China: 2,277 submitted characters (1,268 Zhuang characters, 1,009 characters from the Hanyu Da Zidian (汉语大字典) dictionary)
  • Republic of Korea: 469 submitted characters
  • SAT: 350 submitted characters
  • TCA: 500 submitted characters
  • United Kingdom: 1,640 submitted characters
  • UTC: 311 submitted characters[12]

WS2017. 5,027 submitted characters which resulted in 4,192 characters encoded in CJK Unified Ideographs Extension H (Unicode version 15.0, September 2022):

  • China: 963 submitted characters (143 person name characters, 354 place name characters, 29 characters from the Hanyu Da Cidian (汉语大词典) dictionary, 33 characters from the Dictionary of Chinese Medicine (中医字典), and 404 Zhuang characters)
  • Republic of Korea: 686 submitted characters
  • SAT: 305 submitted characters
  • TCA: 895 submitted characters
  • United Kingdom: 1,001 submitted characters
  • UTC: 193 submitted characters
  • Vietnam: 984 submitted characters[13]
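
The encoded counts quoted above for Extensions G and H can be cross-checked against the code points assigned in each block. The sketch below is illustrative only: the first and last assigned code points are not given in this article and are assumed here from the Unicode code charts, so they should be verified against the current charts. The names EXTENSION_RANGES, assigned_count, and extension_of are hypothetical helpers introduced for this example.

```python
# Assumed (first, last) assigned code points for each block.
# These values are NOT stated in the article; verify them against
# the Unicode code charts before relying on them.
EXTENSION_RANGES = {
    "CJK Unified Ideographs Extension G": (0x30000, 0x3134A),
    "CJK Unified Ideographs Extension H": (0x31350, 0x323AF),
}

# Encoded character counts as quoted in the article.
ENCODED_COUNTS = {
    "CJK Unified Ideographs Extension G": 4939,
    "CJK Unified Ideographs Extension H": 4192,
}


def assigned_count(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


def extension_of(char: str):
    """Return the name of the extension containing char, or None."""
    code_point = ord(char)
    for name, (first, last) in EXTENSION_RANGES.items():
        if first <= code_point <= last:
            return name
    return None


if __name__ == "__main__":
    for name, (first, last) in EXTENSION_RANGES.items():
        matches = assigned_count(first, last) == ENCODED_COUNTS[name]
        print(f"{name}: U+{first:X}..U+{last:X}, matches quoted count: {matches}")
```

Under the assumed ranges, the size of each range agrees with the encoded count quoted for the corresponding working set.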

WS2021. 4,951 submitted characters, of which up to 4,302 may be encoded in CJK Unified Ideographs Extension J in a future version of Unicode:[14]

  • China: 1,223 submitted characters (151 place name characters, 768 science and technology characters, 4 person name characters, and 300 Zhuang characters)
  • Republic of Korea: 191 submitted characters
  • SAT: 383 submitted characters
  • TCA: 1,000 submitted characters
  • United Kingdom: 1,000 submitted characters
  • UTC: 153 submitted characters
  • Vietnam: 1,001 submitted characters[15]

WS2024. A total of 4,674 characters were submitted for Working Set 2024 in July 2024 by China, Republic of Korea, SAT, TCA, United Kingdom, UTC, and Vietnam:[4]

  • China: 1,000 submitted characters, of which 700 are Chinese characters, and 300 are Zhuang characters
  • Republic of Korea: 178 submitted characters
  • SAT: 252 submitted characters
  • TCA: 1,000 submitted characters
  • United Kingdom: 1,000 submitted characters
  • UTC: 244 submitted characters
  • Vietnam: 1,000 submitted characters
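
The per-submitter figures in the four lists above can be tallied to confirm that they add up to the stated totals, and to see what share of each completed working set was eventually encoded. The following is a minimal sketch; the counts are copied from the lists above, and the names WORKING_SETS, submitted_total, and encoded_share are hypothetical helpers introduced for this example.

```python
# Per-submitter counts copied from the working-set lists in this article.
WORKING_SETS = {
    "WS2015": {"China": 2277, "Republic of Korea": 469, "SAT": 350,
               "TCA": 500, "United Kingdom": 1640, "UTC": 311},
    "WS2017": {"China": 963, "Republic of Korea": 686, "SAT": 305,
               "TCA": 895, "United Kingdom": 1001, "UTC": 193,
               "Vietnam": 984},
    "WS2021": {"China": 1223, "Republic of Korea": 191, "SAT": 383,
               "TCA": 1000, "United Kingdom": 1000, "UTC": 153,
               "Vietnam": 1001},
    "WS2024": {"China": 1000, "Republic of Korea": 178, "SAT": 252,
               "TCA": 1000, "United Kingdom": 1000, "UTC": 244,
               "Vietnam": 1000},
}

# Totals as stated in the article for each working set.
STATED_TOTALS = {"WS2015": 5547, "WS2017": 5027, "WS2021": 4951, "WS2024": 4674}

# Encoded counts are only final for the two sets already in Unicode.
ENCODED = {"WS2015": 4939, "WS2017": 4192}


def submitted_total(working_set: str) -> int:
    """Sum the per-submitter counts for one working set."""
    return sum(WORKING_SETS[working_set].values())


def encoded_share(working_set: str) -> float:
    """Fraction of submitted characters that were eventually encoded."""
    return ENCODED[working_set] / submitted_total(working_set)


if __name__ == "__main__":
    for name, stated in STATED_TOTALS.items():
        total = submitted_total(name)
        status = "matches" if total == stated else "differs from"
        print(f"{name}: {total} submitted ({status} stated total {stated})")
    for name in ENCODED:
        print(f"{name}: {encoded_share(name):.1%} of submissions encoded")
```

The difference between submitted and encoded counts reflects characters that were withdrawn, unified with existing characters, or postponed during review.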

References

  1. "ISO/IEC JTC1/SC2/WG2/IRG: Ideographic Rapporteur Group".
  2. "Resolutions of the 24th ISO/IEC JTC 1/SC 2 Plenary Meeting, Redmond, WA, US, 2019-06-17 and 21". ISO/IEC JTC 1/SC 2. 24 June 2019. Retrieved 24 June 2019.
  3. The Unicode Consortium (2021). "Han Unification History: Ideographic Rapporteur Group". The Unicode Standard, Version 14.0.0 (PDF). The Unicode Consortium. p. 987. ISBN 978-1-936213-29-0.
  4. "Ideographic Research Group (ISO/IEC JTC 1/SC 2/WG 2/IRG)". Retrieved 29 July 2024.
  5. "LU, Qin(Lu Chin)". Archived from the original on 22 September 2020. Retrieved 24 June 2019.
  6. "Resolutions of the 23rd ISO/IEC JTC 1/SC 2 Plenary Meeting, London, UK, 2018-06-18, 22". ISO/IEC JTC 1/SC 2. 28 June 2018. Retrieved 24 June 2019.
  7. "Recommendations from WG 2 meeting 71" (PDF). 14 June 2024. Retrieved 20 July 2024.
  8. "Unicode Standard Annex #45: U-source Ideographs". The Unicode Standard. Unicode Consortium.
  9. "Appendix E: Han Unification History" (PDF). The Unicode Standard. Unicode Consortium. September 2021.
  10. "Ideographic Rapporteur Group". Office of the Government Chief Information Officer.
  11. "FAQ - Chinese and Japanese".
  12. "IRG2133: IRG 2015 Collection Version 1.1 attributes". Retrieved 8 May 2024.
  13. "IRG Working Set 2017 - Index of Characters". Retrieved 8 May 2024.
  14. "IRGN2678: WS 2021 V7.0". Retrieved 8 May 2024.
  15. "IRG Working Set 2021 - Index of Characters". Retrieved 8 May 2024.
