Momin M. Malik

I am a researcher who works on healthcare AI, and on critical quantitative methods.

My 2020 preprint, A hierarchy of limitations in machine learning, is my major individual work.

Info

I am faculty at the Mayo Clinic where I research machine learning methodology and ethics, applied to pediatric care. Previously, I was Senior Data Science Analyst - AI Ethics at Mayo Clinic’s Center for Digital Health, where I worked on translating AI to practice. Before that, I was Director of Data Science at Avant-garde Health, a healthcare data startup born out of Harvard Business School’s value-based healthcare research.

I am also a fellow at and co-director of the Institute in Critical Quantitative, Computational, & Mixed Methodologies at Johns Hopkins University, and an instructor at the University of Pennsylvania’s School of Social Policy and Practice.

Download my CV. Last updated 13 June 2025.

Published work

Austin M. Stroud, Michele D. Anzabi, Journey L. Wise, Barbara A. Barry, Momin M. Malik, Michelle L. McGowan, and Richard R. Sharp. 2024. Toward safe and ethical implementation of healthcare AI: Insights from an Academic Medical Center. Mayo Clinic Proceedings: Digital Health. doi: 10.1016/j.mcpdig.2024.100189. [MCP link]

Tracey A. Brereton, Momin M. Malik, Lauren M. Rost, Joshua W. Ohde, Lu Zheng, Kristelle A. Jose, Kevin J. Peterson, David Vidal, Mark A. Lifson, Joe Melnick, Bryce Flor, Jason D. Greenwood, Kyle Fisher, and Shauna M. Overgaard. 2024. AImedReport: A prototype tool to facilitate research reporting and translation of Artificial Intelligence technologies in health care. Mayo Clinic Proceedings: Digital Health 2 (2): 246–251. doi: 10.1016/j.mcpdig.2024.03.008. [MCP link]

Sayash Kapoor, Emily Cantrell, Kenny Peng, Thanh Hien Pham, Christopher A. Bail, Odd Erik Gundersen, Jake M. Hofman, Jessica Hullman, Michael A. Lones, Momin M. Malik, Priyanka Nanayakkara, Russell A. Poldrack, Inioluwa Deborah Raji, Michael Roberts, Matthew J. Salganik, Marta Serra-Garcia, Brandon M. Stewart, Gilles Vandewiele, and Arvind Narayanan. 2024. Reforms: Consensus-based recommendations for machine-learning-based science. Science Advances 10 (18): eadk3452. doi: 10.1126/sciadv.adk3452. [Science link]

Raphael Frankfurter, Maya Malik, Sahr David Kpakiwa, Timothy McGinnis, Momin M. Malik, Smit Chitre, Mohamed Bailor Barrie, Yusupha Dibba, Lulwama Mulalu, Raquel Baldwinson, Mosoka Fallah, Ismail Rashid, J. Daniel Kelly, and Eugene T. Richardson. 2024. Representations of an Ebola ‘outbreak’ through story technologies. BMJ Global Health 9 (2): e013210. doi: 10.1136/bmjgh-2023-013210. [BMJ link]

Young J. Juhn, Momin M. Malik, Euijung Ryu, Chung-Il Wi, and John D. Halamka. 2024. Chapter 47 - Socioeconomic bias in applying artificial intelligence models to health care. In Artificial Intelligence in clinical practice: How AI technologies impact medical research and clinics, edited by Chayakrit Krittanawong, 413–435. Academic Press. doi: 10.1016/B978-0-443-15688-5.00044-9. [Elsevier link (paywall)]

Tracey A. Brereton, Momin M. Malik, Mark A. Lifson, Jason D. Greenwood, Kevin J. Peterson, and Shauna M. Overgaard. 2023. The role of AI model documentation in translational science: A scoping review. Interactive Journal of Medical Research 12: e45903. doi: 10.2196/45903. [JMIR Link]

The Avant-Garde Health and Codman Shoulder Society Value Based Care Group, Adam Z. Khan, Matthew J. Best, Catherine J. Fedorka, Robert M. Belniak, Derek A. Haas, Xiaoran Zhang, April D. Armstrong, Andrew Jawa, Evan A. O’Donnell, Jason E. Simon, Eric R. Wagner, Momin Malik, Michael B. Gottschalk, Gary F. Updegrove, Eric C. Makhni, Jon J. P. Warner, Uma Srikumaran, and Joseph A. Abboud. 2022. Impact of the COVID-19 pandemic on shoulder arthroplasty: Surgical trends and postoperative care pathway analysis. Journal of Shoulder and Elbow Surgery 31 (12): 2457–2464. doi: 10.1016/j.jse.2022.07.020. [JSES link]

Angelina Mooseder, Momin M. Malik, Hemank Lamba, Earth Erowid, Sylvia Thyssen, and Jürgen Pfeffer. 2022. Glowing experience or bad trip? A quantitative analysis of user reported drug experiences on Erowid.org. In Proceedings of the Sixteenth International AAAI Conference on Web and Social Media (ICWSM-2022), 675–689. [AAAI Digital Library] [arXiv preprint (includes supplementary appendix)]

Young J. Juhn, Euijung Ryu, Chung-Il Wi, Katherine S. King, Momin Malik, Santiago Romero-Brufau, Chunhua Weng, Sunghwan Sohn, Richard R. Sharp, and John D. Halamka. 2022. Assessing socioeconomic bias in machine learning algorithms in health care: A case study of the HOUSES index. Journal of the American Medical Informatics Association 29 (7): 1142–1151. doi: 10.1093/jamia/ocac052. [AMIA link (paywall)]

Nicole C. Nelson, Kelsey Ichikawa, Julie Chung, and Momin M. Malik. 2022. Psychology exceptionalism and the multiple discovery of the replication crisis. Review of General Psychology 26 (2): 184–198. doi: 10.1177/10892680211046508. [SAGE link (paywall)] [MetaArXiv:sbv3q (preprint)]

Maya Malik and Momin M. Malik. 2022. Critical technical awakenings. Journal of Social Computing 2 (4): 365–384. doi: 10.23919/JSC.2021.0035. [IEEE link]

Nicole C. Nelson, Kelsey Ichikawa, Julie Chung, and Momin M. Malik. 2021. Mapping the discursive dimensions of the reproducibility crisis: A mixed methods analysis. PLOS ONE 16 (7): e0254090. doi: 10.1371/journal.pone.0254090. [PLOS ONE link] [MetaArXiv:sbv3q (preprint)]

Hal Roberts, Rahul Bhargava, Linas Valiukas, Dennis Jen, Momin M. Malik, Cindy Bishop, Emily Ndulue, Aashka Dave, Justin Clark, Bruce Etling, Rob Faris, Anushka Shah, Jasmin Rubinovitz, Alexis Hope, Catherine D’Ignazio, Fernando Bermejo, Yochai Benkler, and Ethan Zuckerman. 2021. Media Cloud: Massive open source collection of global news on the open web. In Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM-2021), 1034–1045. [AAAI Digital Library] [Preprint (with appendix)]

Eugene T. Richardson, Momin M. Malik, William A. Darity, Jr., A. Kirsten Mullen, Michelle E. Morse, Maya Malik, Adia Benton, Mary T. Bassett, Paul E. Farmer, Lee Worden, and James Holland Jones. 2021. Reparations for Black American descendants of persons enslaved in the U.S. and their potential impact on SARS-CoV-2 transmission. Social Science & Medicine 276: 113741. doi: 10.1016/j.socscimed.2021.113741. [Science Direct link] [Supplementary Material]

Diego Alburez-Gutierrez, Eshwar Chandrasekharan, Rumi Chunara, Sofia Gil-Clavel, Aniko Hannak, Roberto Interdonato, Kenneth Joseph, Kyriaki Kalimeri, Momin M. Malik, Katja Mayer, Yelena Mejova, Daniela Paolotti, and Emilio Zagheni. 2019. Reports of the workshops held at the 2019 International AAAI Conference on Web and Social Media. AI Magazine 40 (4): 78–82. doi: 10.1609/aimag.v40i4.5287. [AAAI Digital Library]

Kar-Hai Chu, Jason Colditz, Momin M. Malik, Tabitha Yates, and Brian Primack. 2019. Identifying key target audiences for public health campaigns: Leveraging machine learning in the case of hookah tobacco smoking. Journal of Medical Internet Research 21 (7): e12443. doi: 10.2196/12443. [JMIR link]

Momin M. Malik. 2018. Bias and beyond in digital trace data. PhD dissertation, Carnegie Mellon University School of Computer Science. [SCS Technical Report Collection] [Defense slides]

Jürgen Pfeffer and Momin M. Malik. 2017. Simulating the dynamics of socio-economic systems. In Networked governance: New research perspectives, edited by Betina Hollstein, Wenzel Matiaske, and Kai-Uwe Schnapp, 143–161. Cham, Switzerland: Springer. doi: 10.1007/978-3-319-50386-8_9. [Springer link (paywall)] [Authors’ copy (contains minor corrections)] [Full-sized vector image of my recreation of the World3 diagram] [BibTeX]

Momin M. Malik and Jürgen Pfeffer. 2016. Identifying platform effects in social media data. In Proceedings of the Tenth International AAAI Conference on Web and Social Media (ICWSM-16), 241–249. May 18–20, 2016, Cologne, Germany. [Updated version, Chapter 2 from dissertation] [AAAI Digital Library] [ICWSM slides] [IC2S2 slides] [Sunbelt slides] [BibTeX]

Momin M. Malik and Jürgen Pfeffer. 2016. A macroscopic analysis of news in Twitter. Digital Journalism 4 (8): 955–979. doi: 10.1080/21670811.2015.1133249. [Taylor & Francis link (paywall)] [Preprint] [BibTeX]

Gabriel Ferreira, Momin Malik, Christian Kästner, Jürgen Pfeffer, and Sven Apel. 2016. Do #ifdefs influence the occurrence of vulnerabilities? An empirical study of the Linux Kernel. In Proceedings of the 20th International Systems and Software Product Line Conference (SPLC ’16), 65–73. September 19–23, 2016, Beijing, China. doi: 10.1145/2934466.2934467. Nominated for Best Paper Award. [ACM link] [arXiv preprint] [BibTeX]

Kathleen M. Carley, Momin Malik, Peter M. Landwehr, Jürgen Pfeffer, and Michael Kowalchuck. 2016. Crowd sourcing disaster management: The complex nature of Twitter usage in Padang Indonesia. Safety Science 90: 48–61. doi: 10.1016/j.ssci.2016.04.002. [ScienceDirect link (paywall)]

Hemank Lamba, Momin M. Malik, and Jürgen Pfeffer. 2015. A tempest in a teacup? Analyzing firestorms on Twitter. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (ASONAM 2015), 17–24. August 25–28, 2015, Paris, France. doi: 10.1145/2808797.2808828. Best student paper award. [ACM link] [BibTeX]

Momin M. Malik, Hemank Lamba, Constantine Nakos, and Jürgen Pfeffer. 2015. Population bias in geotagged tweets. In Papers from the 2015 ICWSM Workshop on Standards and Practices in Large-Scale Social Media Research (ICWSM-15 SPSM), 18–27. May 26, 2015, Oxford, UK. [Updated version, Chapter 1 from dissertation] [AAAI Digital Library] [Slides] [BibTeX]

Reports and blogging

Christelle Tessono, Yuan Stevens, Momin M. Malik, Supriya Dwivedi, Sonja Solomun, and Sam Andrey. 2022. AI oversight, accountability and protecting human rights: Comments on Canada’s proposed Artificial Intelligence and Data Act. November 2. Cybersecure Policy Exchange, Center for Information Technology Policy at Princeton University, and Centre for Media, Technology and Democracy at McGill University. [Report website]

Momin M. Malik. 2019. Can algorithms themselves be biased? Medium, Berkman Klein Center Collection. April 24, 2019. [Medium link] [Mobile-friendly PDF]

Io Flament, Cristina Lozano, and Momin M. Malik. 2017. Data-driven planning for sustainable tourism in Tuscany. Cascais, Portugal: Data Science for Social Good Europe. [Report]

Preprints

Momin M. Malik, Afsaneh Doryab, Michael Merrill, Jürgen Pfeffer, and Anind K. Dey. 2020. Can smartphone co-locations detect friendship? It depends how you model it. [arXiv:2008.02919]

Momin M. Malik. 2020. A hierarchy of limitations in machine learning. [arXiv:2002.05193]

Previous works

These are works done before my PhD. I am still proud of them, but they are quite different from my subsequent research.

Urs Gasser, Momin Malik, Sandra Cortesi, and Meredith Beaton. 2013. Mapping approaches to news literacy curriculum development: A navigation aid. Berkman Center Research Publication No. 2013-25. [SSRN link]

Momin Malik, Sandra Cortesi, and Urs Gasser. 2013. The challenges of defining ‘news literacy’. Berkman Center Research Publication No. 2013-20. [SSRN link]

Momin M. Malik. 2013. The role of incumbency in field emergence: The case of Internet studies. Poster presented at the Science of Team Science (SciTS) Conference 2013, Northwestern University, Evanston, IL, June 24–27, 2013. [PDF]
(Note that this is a poster version of my MSc thesis, adapted for the topic of SciTS. Also, I have since realized the error of a non-statistical approach to significance claims.)

Momin M. Malik. 2012. Networks of collaboration and field emergence in ‘Internet Studies’. Thesis submitted in partial fulfillment of the degree of MSc in Social Science of the Internet at the Oxford Internet Institute at the University of Oxford. Oxford Internet Institute, University of Oxford, Oxford, UK. [PDF]

Urs Gasser, Sandra Cortesi, Momin Malik, and Ashley Lee. 2012. Youth and digital media: From credibility to information quality. Berkman Center Research Publication No. 2012-1. [SSRN link]

Urs Gasser, Sandra Cortesi, Momin Malik, and Ashley Lee. 2010. Information quality, youth, and media: A research update. Youth Media Reporter. [Online]

Momin M. Malik. 2009. Survey of state initiatives for conservation of coastal habitats from sea-level rise. Rhode Island Coastal Resources Management Council. [PDF]

Momin M. Malik. 2008. Rediscovering Ramanujan. Thesis submitted in partial fulfillment for an honors degree in History and Science. The Department of the History of Science, Harvard University, Cambridge, MA. [PDF]

Presentations

Panelist for “LLM for clinical evidence extraction, retrieval, and summarization” keynote by Yifang Peng, moderated by Eric Williamson. Mayo Clinic 2025 AI Summit. Department of Artificial Intelligence & Informatics, Mayo Clinic. Rochester, MN [hybrid], July 7, 2025.

“Introduction to computational and data science in the social sciences” (invited talk). 2025 ICQCM Summit. Institute in Critical Quantitative, Computational, & Mixed Methodologies, School of Education, Johns Hopkins University. Baltimore, MD, June 7, 2025. [Slides]

“Assumptions, inferences, and 𝑝-values in critical perspective” (invited talk). 2025 ICQCM Summit. Institute in Critical Quantitative, Computational, & Mixed Methodologies, School of Education, Johns Hopkins University. Baltimore, MD, June 7, 2025. [Slides]

“Introduction to data analysis with RStudio” (invited talk). 2025 ICQCM Summit. Institute in Critical Quantitative, Computational, & Mixed Methodologies, School of Education, Johns Hopkins University. Baltimore, MD, June 6, 2025. [Slides]

“Ethical implementation of AI in healthcare: Reasoning about the limits of AI” (invited talk). Mayo Clinic’s Innovations in Gastroenterology and Hepatology 2024: AI and Beyond. School of Continuous Professional Development, Mayo Clinic. Denver, CO, September 20, 2024.

Talk and discussion (invited) on “A hierarchy of limitations in machine learning.” Department of Philosophy of Science & Technology of Computer Simulation, High Performance Computing Center Stuttgart, University of Stuttgart. Stuttgart, Germany [remote], June 6, 2024. [Slides]

“Critical technical awakenings” (invited talk). With Maya Malik. Module 3: Theories and Methods, Easter 2022 (Instructor: Dr. Jonnie Penn). Master of Studies in AI Ethics and Society, Institute of Continuing Education, University of Cambridge. Cambridge, UK [remote], June 3, 2024. [Slides]

“What AI ethics ought to be about: The nature, limits, and consequences of non-causal, outcome-only modeling” (invited talk). Lunch & Learn, Department of Artificial Intelligence & Informatics, Mayo Clinic. Rochester, MN, January 25, 2024.

“AI translation for healthcare: Ethics & bias.” 2023 FDA Experiential Learning Program, virtual site visit to Software as a Medical Device (SaMD) Regulatory group. Center for Digital Health, Mayo Clinic. Rochester, MN [held online], September 20, 2023.

“What AI ethics ought to be about: A framework for healthcare based on AI as correlation-only modeling” (invited talk). AI Speaker Series. Center for Digital Health, Mayo Clinic. Rochester, MN [held online], June 19-21, 2023.

“Don’t trust explainable AI: Proper validation is what matters” (poster presentation). AI Summit. Department of Artificial Intelligence & Informatics, Mayo Clinic. Rochester, MN, June 19-21, 2023. [Poster]

“Critique and quantitative methods in the case for reparations”. The FXB Center’s Making the Public Health Case for Reparations Methods Workshop. François-Xavier Bagnoud Center for Health and Human Rights at Harvard University. Boston, MA, June 5, 2023. [Slides]

“Generalizability, meaningfulness, and meaning: Machine learning in the social world” (invited talk). Seminario Conjunto de Estadística y Ciencia de Datos [Joint Seminar on Statistics and Data Science], Centro de Investigación en Matemáticas (CIMAT). Guanajuato, Mexico [held online], May 10, 2023. [Slides]

“Conceptualizing progress: Beyond the ‘common task framework’.” Part of “Beyond the data and model: Integration, enrichment, and progress”, Webinar 3 in support of the NIH/NCATS Bias Detection Tools for Clinical Decision Making Challenge. With Shauna M. Overgaard, Young J. Juhn, and Chung Il Wi. National Center for Advancing Translational Sciences, National Institutes of Health. [online], February 17, 2023. [Slides] [Video]

Invited panelist, “Incorporating ethical thinking into research & innovation through education, planning, conduct, and communication.” Chaired by Michael Hawes, with session organizers Jing Cao and Stephanie Shipp, and co-panelists James Giordano, Jeri Mulrow, Katie Shay, and Nathan Colaner. Sponsored by Committee on Professional Ethics (primary), Statistics Without Borders, Committee on Scientific Freedom and Human Rights, and Committee on Funded Research. JSM 2022 Invited Session for ASA COPE, Joint Statistical Meetings 2022. Washington, DC, August 7, 2022.

“When (and why) we shouldn’t expect reproducibility in machine learning-based science: Culture, causality, and metrics as estimators” (invited talk). The Reproducibility Crisis in ML-based Science [workshop]. Center for Statistics and Machine Learning, Princeton University. [online], July 28, 2022. [Slides] [Video]

“Ethical considerations for measuring impact to health care” (invited guest lecture). AIHC 5030: Introduction to Deployment, Adoption & Maintenance of Artificial Intelligence Models/Algorithms, Spring 2022 (Instructors: Dr. Shauna Overgaard, PhD, and Dr. Chris Aakre, MD). Artificial Intelligence in Health Care Track, Mayo Clinic Graduate School of Biomedical Sciences. Rochester, MN [remote], June 2, 2022.

“Ethics in the lifecycle of AI: From research and development to clinical implementation” (invited guest lecture). CTSC 5350: Ethical Issues in Artificial Intelligence and Information Technologies, Spring 2022 (Instructors: Dr. Richard Sharp, PhD and Dr. Barbara Barry, PhD). Clinical and Translational Sciences Track, Mayo Clinic Graduate School of Biomedical Sciences. Rochester, MN [remote], May 31, 2022.

“A critical perspective on measurement in digital trace data and machine learning, and implications for demography” (invited talk). Max Planck Institute for Demographic Research Seminar Series, Rostock, Germany, April 26, 2022. [Slides]

Invited panelist, “Predictive justice.” With co-panelist Safiya Noble. AERA Presidential Session, “Expansive futures for disability intersectional learning research: Braiding culture, history, equity, and enabling technologies.” 2022 American Educational Research Association Annual Meeting. San Diego, CA, Saturday, April 23, 2022.

“The technical perspective on ethics: An overview and critique” (invited talk). Center for Digital Ethics and Policy 2022 Annual International Symposium: Digital Ethics for a Sustainable Society. School of Communication, Loyola University Chicago. [held online], March 29, 2022. [Slides]

“Critical approaches to machine learning” (invited session). ICQCM Summit, Baltimore, MD, Sunday, March 21, 2022. [Slides]

Invited panelist, “AI, race, and algorithmic justice in research.” Moderated by Ezekiel Dixon-Román, with co-panelists Meredith Broussard and Kadija Ferryman. ICQCM Summit, Baltimore, MD, Saturday, March 22, 2022.

Invited panelist, “Approaches to managing trustworthy AI.” Moderated by Maggie Little, with co-panelists Ashley Casovan, Jacob Metcalf, Rayid Ghani, and Ram Kumar. Panel 5 at Kicking off NIST AI Risk Management Framework workshop. National Institute of Standards and Technology, U.S. Department of Commerce. [held online], October 20, 2021.

“Machine learning in the hierarchy of methodological limitations” (invited talk). TILT Seminar Series 2021, Tilburg Institute for Law, Technology, and Society, Tilburg University. Tilburg, Netherlands [held online], September 21, 2021. [Slides]

“Networks and graphical models: A survey.” Networks 2021. July 6, 2021 [delivered online]. [Slides]

“Defining critical quantitative and computational methodologies” (invited session). Moderated by Ezekiel Dixon-Román. William T. Grant AQC SCHOLARS Virtual Seminar Series, Institute in Critical Quantitative, Computational, & Mixed Methodologies, Johns Hopkins University. May 27, 2021 [delivered online]. [Slides]

“Media Cloud: Massive open source collection of global news on the open Web.” Fifteenth International AAAI Conference on Web and Social Media (ICWSM-2021). [held online], June 10, 2021.

“Critical theory and quantification” (invited session). With Maya Malik and Ezekiel Dixon-Román. Histories of Artificial Intelligence: A Genealogy of Power, Mellon Sawyer Seminar, University of Cambridge. January 20, 2021 [delivered online]. [Slides]

“A hierarchy of limitations in machine learning” (invited talk). Math and Democracy Seminar Series, Center for Data Science, New York University. October 5, 2020, New York, New York [delivered online]. [Slides]

“A hierarchy of limitations in machine learning: Data biases and the social sciences” (invited webinar). Webinar Series: Data Cultures in Higher Education, Faculty of Psychology and Education, Universitat Oberta de Catalunya (Open University of Catalonia). September 29, 2020, Barcelona, Spain [online]. [Slides] [Video, with Spanish subtitles (Una jerarquía de limitaciones en el Machine Learning. Sesgos en su uso en investigación social)]

“Machine learning won’t save us: Dependencies bias cross-validation estimates of model performance.” 2020 Sunbelt Virtual Conference of the International Network for Social Network Analysis. July 17, 2020. [Slides]

“Anti-racism and COVID-19” (invited talk). With Eugene T. Richardson, William A. Darity, Jr., James Holland Jones, A. Kirsten Mullen, and Paul E. Farmer. Global Health and Social Medicine Seminar Series, Department of Global Health & Social Medicine, Harvard Medical School. June 3, 2020, Cambridge, Massachusetts [delivered online].

“Antiracism and COVID-19” (invited talk). With Eugene T. Richardson. Antiracism & Technology Design Seminar, Space Enabled research group, MIT Media Lab. May 13, 2020, Cambridge, Massachusetts [delivered online].

“Critical technical practice revisited: Towards ‘analytic actors’ in data science” (invited talk). STS Circle, Program on Science, Technology & Society, Harvard Kennedy School. March 5, 2020, Cambridge, Massachusetts. [Slides]

“Revisiting ‘all models are wrong’: Addressing limitations in big data, machine learning, and computational social science” (invited talk). Wednesdays@NICO Seminar Speaker Series, Northwestern Institute on Complex Systems, Northwestern University. February 5, 2020, Evanston, Illinois. [Slides] [Video]

“How STS can improve data science” (invited talk). Science, Technology and Society Lunch Seminar, Tufts University. January 23, 2020, Medford, Massachusetts. [Slides]

“A hierarchy of limitations in machine learning” (invited talk). Microsoft Research New England. December 3, 2019, Cambridge, Massachusetts. [Slides]

“Correlates of oppression: Machine learning and society” (invited talk). Guest lecture in MIT CMS.701/CMS.901: Current Debates in Media, Fall 2019 (Instructor: Dr. Sasha Costanza-Chock). Comparative Media Studies, Massachusetts Institute of Technology. October 30, 2019, Cambridge, Massachusetts. [Slides]

“Statistics and machine learning: Foundations, limitations, and ethics” (invited talk). Colby College Department of Mathematics and Statistics, Colloquium Fall 2019, Colby College. October 7, 2019, Waterville, Maine. [Slides]

“A critical introduction to machine learning.” 2019 ACM Richard Tapia Celebration of Diversity in Computing Conference. September 19, 2019, Marriott Marquis San Diego Marina, San Diego, California. [Slides]

“Everything you ever wanted to know about network statistics but were afraid to ask.” XXXIX Sunbelt Social Networks Conference of the International Network for Social Network Analysis. June 18, 2019, UQAM, Montreal, Quebec. [Slides] [R script]

“Three open problems for historians of AI.” Towards a History of Artificial Intelligence, Columbia University. May 24, 2019, New York, New York. [Slides] [Video]

“Interpretability is a red herring: Grappling with ‘prediction policy problems.’” 17th Annual Information Ethics Roundtable: Justice and Fairness in Data Use and Machine Learning. April 5, 2019, Northeastern University, Boston, Massachusetts. [Slides and draft] [Draft only]

“What can AI do with copyrighted data?” (invited talk). Bracing for Impact – The Artificial Intelligence Challenge: A Roadmap for AI Governance in Canada. Part II: Data, Policy & Innovation. IP Osgoode, Osgoode Hall Law School, York University. March 21, 2019, Toronto Reference Library, Toronto, Canada.

“The ethical implications of technical limitations” (invited talk). Fairness, Accountability & Transparency/Asia, Digital Asia Hub and ACM/FAT*. January 12, 2019, Shun Hing College, University of Hong Kong, Hong Kong.

“Machine learning for social scientists.” Fairness, Accountability & Transparency/Asia, Digital Asia Hub and ACM/FAT*. January 11, 2019, Shun Hing College, University of Hong Kong, Hong Kong. [Slides]

“‘AI’ is a lie: Getting to the real issues.” AGTech Forum, Berkman Klein Center for Internet & Society at Harvard University. December 13, 2018, Cambridge, Massachusetts. [Slides]

“Theorizing sensors for social network research” (invited talk). Computational Social Science Institute, UMass Amherst. December 7, 2018, Amherst, Massachusetts. [Slides]

“What everyone needs to know about ‘prediction’ in machine learning” (invited talk). Leverhulme Centre for the Future of Intelligence, University of Cambridge. December 3, 2018, Cambridge, UK. [Slides]

“Anxiety, crisis, and a computational future for journalism.” Philip Merrill College of Journalism / College of Information Studies, University of Maryland. November 27, 2018, College Park, Maryland.

“Networks, yeah! The representation of relations” (invited talk). Data & Donuts, DigitalHKS, Harvard Kennedy School, Harvard University. November 2, 2018, Cambridge, Massachusetts.

“Demystifying AI: Terms of disservice.” AI Working Group, Berkman Klein Center for Internet & Society. October 28, 2018, Cambridge, Massachusetts.

“Surprising aspects of ‘prediction’ in data science.” 0213eight, Harvard Alumni Association. October 13, 2018, Cambridge, Massachusetts.

“From the forest to the swamp: Modeling vs. implementation in data science” (invited talk). Techtopia @ Harvard University. October 2, 2018, Cambridge, Massachusetts.

Thesis defense: Bias and beyond in digital trace data. Institute for Software Research, School of Computer Science, Carnegie Mellon University. August 9, 2018, Pittsburgh, Pennsylvania. [Slides]

“Friendship and proximity in a fraternity cohort with mobile phone sensors.” XXXVIII Sunbelt Conference of the International Network for Social Network Analysis. Modeling network dynamics (ses15.05). July 1, 2018, Utrecht, Netherlands. [Slides]

“A critical introduction to statistics and machine learning.” Cascais Data Science for Social Good Europe Fellowship, Nova School of Business and Economics, Universidade NOVA de Lisboa. August 15, 2017, Cascais/Lisbon, Portugal. [Part I Slides] [Part II Slides]

“A social scientist’s guide to network statistics” (guest lecture). 70/73-449: Social, Economic and Information Networks, Fall 2016 (Instructor: Dr. Katharine Anderson). Undergraduate Economics, Tepper School of Business, Carnegie Mellon University. November 10, 2016, Pittsburgh, Pennsylvania. [Slides]

“Platform effects in social media networks.” 2nd Annual International Conference on Computational Social Science. Social Networks 1. June 24, 2016, Evanston, Illinois. [Slides]

“Identifying platform effects in social media data.” Tenth International AAAI Conference on Web and Social Media (ICWSM-16). Session I: Biases and Inequalities. May 18, 2016, Cologne, Germany. [Slides]

“Social media data and computational models of mobility: A review for demography.” 2016 ICWSM Workshop on Social Media and Demographic Research (ICWSM-16 SMDR). May 17, 2016, Cologne, Germany. [Slides]

“Platform effects in social media networks.” XXXVI Sunbelt Conference of the International Network for Social Network Analysis. Social Media Networks: Challenges and Solutions (Sunday AM2). April 10, 2016, Newport Beach, California. [Slides]

“A social scientist’s guide to network statistics (presented to statisticians).” stat-network seminar, Department of Statistics, Carnegie Mellon University. March 25, 2016, Pittsburgh, Pennsylvania. [Slides not public, see these slides for the same content.]

“Ethical and policy issues in predictive modeling” (guest lecture). 08-200/08-630/19-211: Ethics and Policy Issues in Computing, Spring 2016 (Instructor: Professor James Herbsleb). Institute for Software Research, School of Computer Science, Carnegie Mellon University. March 1, 2016, Pittsburgh, Pennsylvania. [Slides]

“Population bias in geotagged tweets”. 2015 ICWSM Workshop on Standards and Practices in Social Media Research (ICWSM-15 SPSM). May 26, 2015, Oxford, UK. [Slides]

“Inferring social networks from sensor data”. XXXIV Sunbelt Conference of the International Network for Social Network Analysis. Network Data Collection (Saturday AM2). February 22, 2014, St Pete Beach, Florida. [Slides]

Acknowledged in

I try to properly acknowledge people who contribute to my work, and conversely am proud to be found in the acknowledgements of the following works:

Apryl Williams. 2024. Not my type: Automating sexual racism in online dating. Stanford, CA: Stanford University Press.

Sireesh Gururaja, Amanda Bertsch, Clara Na, David Gray Widder, and Emma Strubell. 2023. To build our future, we must know our past: Contextualizing paradigm shifts in Natural Language Processing. [Preprint]

Barbara Kiviat. 2023. The moral affordances of construing people as cases: How algorithms and the data they depend on obscure narrative and noncomparative justice. Sociological Theory. doi: 10.1177/07352751231186797. [IEEE link]

Ben Green. 2021. Data science as political action: Grounding data science in a politics of justice. Journal of Social Computing 2 (3): 249–265. doi: 10.23919/JSC.2021.0029. [IEEE link]

Jonnie Penn. 2021. Algorithmic silence: A call to decomputerize. Journal of Social Computing 2 (4): 337–356. doi: 10.23919/JSC.2021.0023. [IEEE link]

Chelsea Barabas, Audrey Beard, Theodora Dryer, Beth Semel, and Sonja Solomun. 2020. Abolish the #TechToPrisonPipeline. Coalition for Critical Technology, June 22. [Letter website]

Dariusz Jemielniak. 2019. Socjologia Internetu (in Polish). Warszawa: Wydawnictwo Naukowe Scholar. [Publisher website] [Sample content and reference list from author]

Keiki Hinami, Michael J. Ray, Kruti Doshi, Maria Torres, Steven Aks, John J. Shannon, and William E. Trick. 2019. Prescribing associated with high-risk opioid exposures among non-cancer chronic users of opioid analgesics: A social network analysis. Journal of General Internal Medicine 34: 2443–2450. doi: 10.1007/s11606-019-05114-3. [Springer link (paywall)] [PubMed record (abstract only)]

Viktor Mayer-Schönberger and Kenneth Cukier. 2013. Big Data: A revolution that will transform how we live, work, and think. Boston and New York: Eamon Dolan/Houghton Mifflin Harcourt. [Book website]

Mary Madden, Amanda Lenhart, Sandra Cortesi, Urs Gasser, Maeve Duggan, Aaron Smith, and Meredith Beaton. 2013. Teens, social media, and privacy. Pew Internet & American Life Project. [Report website]

Press, quotes, and commentaries/editorials

Quotes from me, or notable coverage/mentions of my work:

Will Knight. 2022. Sloppy use of machine learning is causing a ‘reproducibility crisis’ in science. Wired, August 10. [Wired link]

Elizabeth Gibney. 2022. Could machine learning fuel a reproducibility crisis in science? ‘Data leakage’ threatens the reliability of machine-learning use across disciplines, researchers warn. Nature 608: 250–251. doi: 10.1038/d41586-022-02035-w. [Nature link]

Scottie Andrew. 2021. Reparations for slavery could have reduced Covid-19 transmission and deaths in the US, Harvard study says. CNN, February 16. [CNN link]

Wendy Hui Kyong Chun and Jorge Cottemay. 2020. Reimagining Networks: An interview with Wendy Hui Kyong Chun. The New Inquiry, May 12. [New Inquiry link]

Susan Cassels and Sigrid Van Den Abbeele. 2021. A call for epidemic modeling to examine historical and structural drivers of racial disparities in infectious disease [Commentary on “Reparations for Black American descendants of persons enslaved in the U.S. and their potential impact on SARS-CoV-2 transmission”]. Social Science & Medicine 276: 113833. doi: 10.1016/j.socscimed.2021.113833. [Science Direct link]

Bob Franklin. 2016. The future of journalism: Risks, threats, and opportunities [Mention of “A macroscopic analysis of news content on Twitter”]. Journalism Practice 10 (7): 805–807. doi: 10.1080/17512786.2016.1197640. [Taylor & Francis link]

Reviewing, organizing, and program committees

I was a Senior PC member for the International AAAI Conference on Web and Social Media (ICWSM) in 2020.

I was Sponsorship Chair for the 14th International Conference on Web and Social Media (ICWSM-2020), Atlanta, Georgia, June 8–June 11, 2020.

I was an Editorial Board member for the 2019 special issue on “Critical Data and Algorithms Studies” in Frontiers in Big Data, Data Mining and Management (Frontiers Media S.A.).

I was co-organizer of the Workshop on Critical Data Science at 13th International Conference on Web and Social Media (ICWSM-2019), Munich, Germany, June 11, 2019.

I was posters co-chair for the 11th International ACM Web Science Conference 2019 (WebSci ’19), Boston, Massachusetts, June 30–July 3, 2019.

I have done peer review for:

Contact

I may be reached at gmail (my first name dot my last name).

This website is my primary online presence, but I maintain profiles elsewhere as well: