The workshop was organized with two all-group discussion slots at the beginning and the end of the workshop. In between, the workshop participants organized hackathon activities based on topics identified during the initial discussion and in submitted position papers. The following topic areas were identified and discussed.
The IETF holds a wide range of data sources. The main ones used are the mailing list archives [Mail-Arch], the RFCs [IETF-RFCs], and the datatracker [Datatracker]. The latter provides information on participants, authors, meeting proceedings, and minutes; see [Data-Overview]. Furthermore, there are IETF statistics [IETF-Statistics], the working group GitHub repositories, and the IETF survey data [Survey-Data]. The utility of download statistics for the RFCs themselves from different repositories was also discussed.
A wide range of tools for analyzing this data has been produced by IETF participants and by researchers interested in the work of the IETF. Two projects that presented their work at the workshop were BigBang [BigBang] and Sodestream's ietfdata library [ietfdata]. The RFC Prolog Database was described in a submitted paper; see [Prolog-Database]. These projects could add insight to the existing statistics in [ArkkoStats] and [DatatrackerStats], e.g., gender-related information. Privacy issues and the implications of making such data publicly available were discussed as well.
The datatracker itself is a community tool that welcomes contributions, for example, additions to the existing interfaces or to the statistics page directly; see [Data-Overview]. At the time of the workshop, instructions on how to set up a local development environment could be found at [DataResources]. Questions or discussion about the datatracker and possible enhancements can be sent to tools-discuss@ietf.org.
A large portion of the submitted position papers indicated interest in researching questions about industry control in the standardization process (as opposed to individual contributions in a personal capacity), where industry control covers both a) technical contributions and the ability to successfully standardize these contributions and b) competition for leadership roles. To assess these questions, participants discussed investigating participant affiliations, including "indirect" affiliations (e.g., by tracking funding and changes in affiliation). The need to model company characteristics or stakeholder groups was also discussed.
Discussion about the analysis of IETF data showed that affiliation dynamics are hard to capture, both because of the specifics of how the data is entered and because of larger social dynamics. On the side of IETF data capture, affiliation is an open text field, so people write the same affiliation in different ways (e.g., differences in capitalization, spacing, and word separation). A common data format could contribute to analyses that compare SDO performance and the behavior of actors inside and across standards bodies. To help with this, a draft data model was developed during the hackathon portion of the workshop; the data model can be found in Appendix A.
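The free-text problem described above can be illustrated with a minimal sketch that maps affiliation strings to a crude comparison key. This is not the data model from Appendix A or a method presented at the workshop; the function names and the list of legal-form suffixes are assumptions made for illustration only.

```python
import re
import unicodedata

# Hypothetical list of legal-form suffixes to ignore; not taken from the workshop.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "gmbh", "corp", "corporation", "co"}


def normalize_affiliation(raw: str) -> str:
    """Return a crude comparison key for a free-text affiliation string."""
    # Unify Unicode forms and letter case.
    text = unicodedata.normalize("NFKC", raw).casefold()
    # Treat punctuation (commas, periods, hyphens, ...) as word separators.
    text = re.sub(r"[^\w\s]", " ", text)
    # Splitting on whitespace also collapses repeated spaces.
    tokens = [t for t in text.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def group_affiliations(raw_values):
    """Group raw affiliation strings that share the same normalized key."""
    groups = {}
    for raw in raw_values:
        groups.setdefault(normalize_affiliation(raw), []).append(raw)
    return groups
```

A key of this kind only addresses capitalization, spacing, and punctuation. It does not merge run-together spellings (e.g., "ExampleNetworks" versus "Example Networks"), and it says nothing about mergers, acquisitions, or subsidiaries, which still require curated data.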
Furthermore, there is the issue of mergers, acquisitions, and subsidiary companies. There is no authoritative exogenous source of variation for affiliation changes, so hand-collected and curated data is used to analyze changes in affiliation over time. While this approach is imperfect, conclusions can be drawn from the data. For example, when a small organization joins a large organization through a merger or acquisition, there is a statistically significant increase in the likelihood of an individual being put in a working group chair position (see [LEADERSHIP-POSITIONS]).
The workshop participants were highly interested in using existing data to better understand who the current IETF community is. They were also interested in the community's diversity and how to potentially increase it and thereby increase inclusivity, e.g., understanding if there are certain factors that "drive people away" and why. Inclusivity and transparency about the standardization process are generally important to keep the Internet and its development process viable. As commented during the workshop discussion, when measuring and evaluating different angles of diversity, it is also important to understand the actual goals that are intended when increasing diversity, e.g., in order to increase competence (mainly technical diversity from different companies and stakeholder groups) or relevance (also regional diversity and international footprint).
The discussion on community and diversity spanned from methods that draw from novel text mining, time series clustering, graph mining, and psycholinguistic approaches to understand the consensus mechanism to more speculative approaches about what it would take to build a feminist Internet. The discussion also covered the data needed to measure who is in the community and how diverse it is.
The discussion highlighted that part of the challenge is defining what diversity means and how to measure it, or as one participant highlighted, defining "who the average IETFer is". There was a question about what to do about missing data or non-participating or underrepresented communities, like women, individuals from the African continent, and network operators. In terms of how IETF data is structured, various researchers mentioned that it is hard to track conversations because mail threads split, merge, and change. The ICANN-at-large model came up as an example of how to involve relevant stakeholders in the IETF that are currently not present. Conversely, it is also interesting for outside communities (especially policy makers) to get a sense of who the IETF community is and keep them updated.
The human element of the community and diversity was highlighted. In order to understand the IETF community's diversity, it is important to talk to people (beyond text analysis). In order to ensure inclusivity, individual participants must make an effort to, as one participant recounted, tell others that their participation is valuable.
A number of submissions focused on the RFC publication process, on the development of standards and other RFCs in the IETF, and on how the IETF makes decisions. This included work on technical decisions about the content of the standards, on procedural and process decisions, and on questions around how we can understand, model, and perhaps improve the standards process. Some of the work considered what makes an RFC successful, how RFCs are used and referenced, and what we can learn about the importance of a topic by studying the RFCs, Internet-Drafts, and email discussions.
There were three sets of questions to consider in this area. The first question related to the success and failure of standards and considered:
- What makes a successful and good RFC?
- What makes the process of making RFCs successful?
- How are RFCs used and referenced once published?
Discussion considered how to better understand the path from an Internet-Draft to an RFC, to see if there are specific factors that lead to successful development of an Internet-Draft into an RFC. Participants explored the extent to which this depends on the seniority and experience of the authors, on the topic and IETF area, on the extent and scope of mailing list discussion, and other factors, to understand whether success of an Internet-Draft can be predicted and whether interventions can be developed to increase the likelihood of success for work.
The second question focused on decision making.
- How does the IETF make design decisions?
- What are the bottlenecks in effective decision making?
- When is a decision made? And what is the decision?
Difficulties here lie in capturing decisions and the results of consensus calls early in the process, and understanding the factors that lead to effective decision making.
Finally, there were questions regarding what can be learned about protocols by studying IETF publications, processes, and decision making. For example:
- Are there insights to be gained around how security concerns are discussed and considered in the development of standards?
- Is it possible to verify correctness of protocols and detect ambiguities?
- What can be learned by extracting insights from implementations and activities on implementation efforts?
Answers to these questions will come from analysis of IETF emails, RFCs and Internet-Drafts, meeting minutes, recordings, GitHub data, and external data such as surveys.
The final discussion session considered environmental sustainability. Topics included the IETF's role with respect to climate change, both in terms of the environmental impact of the way the IETF develops standards and in terms of the environmental impact of the standards the IETF develops.
Discussion started by considering how sustainable IETF meetings are, focusing on the amount of carbon dioxide (CO2) emissions IETF meetings are responsible for and how the IETF can be made more sustainable. Analysis looked at the home locations of participants, meeting locations, and the carbon footprint of air travel and remote attendance to estimate the CO2 costs of an IETF meeting. While the analysis is ongoing, initial results suggest that the costs of holding multiple in-person IETF meetings per year are likely unsustainable in terms of CO2 emissions.
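The kind of estimate described above can be sketched as a great-circle distance between each participant's home location and the meeting venue, multiplied by a per-kilometer emission factor. This is a rough illustration and not the analysis presented at the workshop: the function names are hypothetical, the emission factor is a parameter the caller must supply from a source they trust, and remote attendance, routing, and aircraft type are ignored.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def meeting_travel_co2_kg(venue, homes, kg_co2_per_passenger_km):
    """Estimate round-trip air travel emissions for in-person participants.

    kg_co2_per_passenger_km is an assumption supplied by the caller;
    no default is given because this sketch does not endorse a value.
    """
    return sum(2 * great_circle_km(home, venue) * kg_co2_per_passenger_km
               for home in homes)
```

Comparing the result for different candidate venues, with the same participant list and factor, gives a relative ranking even when the absolute numbers are uncertain.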
The extent to which climate impacts are considered during the development and standardization of Internet protocols was discussed. RFCs and Internet-Drafts of active working groups were reviewed for relevant keywords to highlight the extent to which climate change, energy efficiency, and related topics were considered in the design of Internet protocols. This review revealed the limited extent to which these topics have been considered. There is ongoing work to get a fuller picture by reviewing meeting minutes and mail archives as well, but initial results show only limited consideration of these important issues.
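A keyword review of this kind can be sketched as a simple scan over document texts. The keyword list below is a hypothetical example, not the list used in the review described above, and the function names are illustrative. Documents are assumed to be already available as plain-text strings.

```python
import re
from collections import Counter

# Hypothetical keyword list for illustration only.
KEYWORDS = ("climate change", "energy efficiency", "carbon", "sustainability")


def keyword_counts(text, keywords=KEYWORDS):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    # Collapse whitespace so phrases wrapped across lines still match.
    flattened = " ".join(text.lower().split())
    counts = Counter()
    for kw in keywords:
        pattern = r"\b" + re.escape(kw.lower()) + r"\b"
        counts[kw] = len(re.findall(pattern, flattened))
    return counts


def documents_mentioning(docs, keywords=KEYWORDS):
    """Return the sorted names of documents with at least one keyword hit.

    docs maps a document name to its plain text.
    """
    return sorted(
        name for name, text in docs.items()
        if sum(keyword_counts(text, keywords).values()) > 0
    )
```

A scan like this only indicates where a topic is mentioned. Judging whether it was actually considered in the protocol design still requires reading the matching passages.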