Kareem Carr

		Job
	Nando de Freitas	Researcher, DeepMind
	Nige Willson	Speaker
	Ria Pratyusha Kalluri	Researcher, MIT
	Ifeoma Ozoma	Director, Earthseed
	Will Knight	Journalist, Wired
	Dr Kate Crawford	Researcher, Microsoft; Professor, New York University
	Justin Hendrix	Researcher, NYU Tandon School of Engineering; CEO
	Jenn Wortman Vaughan	Researcher, Microsoft
	Dr Mona Sloane	Researcher, New York University; Sociologist
	Kathy Baxter	Ethics Architect, Salesforce
	Amba Kak	Director, AI Now Institute
	Nico Grant	Journalist, Bloomberg
	Madeleine Clare Elish	Researcher, Google
	Frank Pasquale	Researcher, University of Maryland
	Emily Denton	Researcher, Google Brain
	Scott Thurm	Journalist, Wired
	Kyle M L Jones	Researcher, University of Indianapolis
	Alessandro Bongioanni	Researcher, University of Oxford
	Ayanna Howard	Chair, School of Interactive Computing at Georgia Tech; Researcher; Author
	Michael Veale	Advisor, Open Rights Group
	Meredith Broussard	Professor, Arthur L. Carter Journalism Institute, NYU; Researcher; Author
	Arvind Narayanan	Professor, Princeton; Researcher
	Solon Barocas	Researcher, Microsoft; Founder, FAccT
	Sergey Levine	Researcher, Berkeley; Professor
	Jurgen Schmidhuber	Researcher, NNAISENSE; Professor, Dalle Molle Institute for Artificial Intelligence Research
	Jascha Sohl-Dickstein	Researcher, Google Brain
	Niloufar Salehi	Professor, Berkeley; Researcher
	Veena Dubal	Professor, University of California
	Tom Simonite	Journalist, Wired
	Shalini Kantayya	Filmmaker
	Kareem Carr	Biostatistician, Harvard
	Devin Guillory	Researcher, Berkeley
	Jack Clark	Policy Director, OpenAI
	Cade Metz	Journalist, New York Times; Author
	Yeshimabeit Milner	Data Scientist, Highlander Research
	Nicolas Le Roux	Professor, McGill University; Researcher, Google Brain
	Julia Angwin	Journalist, The Markup
	Ryan Mac	Journalist, BuzzFeed
	Vijay Chidambaram	Professor, University of Texas; Researcher, VMware Research
	Michael Ekstrand	Researcher, Boise State University
	Casey Fiesler	Professor, University of Boulder
	Mar Hicks	Professor, Illinois Institute of Technology
	Gideon Lichfield	Editorial Director, Wired
	William Isaac	Researcher, DeepMind
	Cathy O Neil	Mathematician; Data Scientist; Author
	Luke Stark	Professor, University of Ontario; Author
	Talia Ringer	Professor, Washington University
	Khadijah Abdurahman	Researcher
	Tawana Petty	Director, Data for Black Lives; Author
	Khari Johnson	Journalist, VentureBeat

AI Expert Profile

Nationality:

American

AI specialty:

Data science

Current occupation:

Biostatistician, Harvard

AI rate (%):

44.16%

Twitter:

https://twitter.com/kareem_carr

TwitterID:

@kareem_carr

Tweet Visibility Status:

Public

Description:

Kareem has always had a broad and eclectic interest in using computation and mathematics to do science. He has worked as a bioinformatician at Harvard and as a computational biologist at the Broad Institute and, in parallel, as a data science consultant at Harvard's Institute for Quantitative Social Science, where he took part in more than 100 social science projects and taught programming workshops to Harvard and MIT students. His main research interest is the role of statistics in the production of knowledge. He believes it is useful to distinguish reproducibility (the science itself) from replicability (the software that results from it). He is very active in online debates about AI, notably with the experts Judea Pearl and Danilo Bzdok.

Recognized by:

Not available

The expert's latest posts:

Tweet list:

2023-05-22 20:48:50 RT @kareem_carr: WHY do we divide by n-1 when computing the sample variance? I've never seen this way of explaining this concept anywhere…

2023-05-22 15:22:46 RT @cam_o_gram: I think I have a crush on this dude... this is excellent.

2023-05-22 14:56:14 I enjoy explaining math and statistics ideas. Follow me for more content like this, and don't forget to click the little notification bell so you don't miss out on future threads. https://t.co/V2wY54dqaZ

2023-05-22 14:56:13 This isn't the whole story. There is one more twist of mathematical luck that makes the algebra work out. But this is the main idea. I hope this makes the appearance of n-1 feel less mysterious.

2023-05-22 14:56:12 IDEA: BESSEL'S CORRECTION CANCELS THE CORRELATION FACTOR Notice that the correlation factor and Bessel's correction cancel each other out when multiplied. So that's the story of where the Bessel's correction comes in and why we divide by n-1. https://t.co/qGmbxKI24B
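The effect of Bessel's correction can be checked exactly on a small case. The sketch below is an illustration, not part of the thread: it assumes a made-up four-value population sampled with replacement, enumerates every possible sample of size 3, and compares the expected value of the estimate when dividing by n and by n-1. The function name and the population are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_variance_estimate(population, n, divisor):
    """Exact expectation of sum((x - sample_mean)^2) / divisor over all
    equally likely samples of size n drawn with replacement."""
    samples = list(product(population, repeat=n))
    total = Fraction(0)
    for sample in samples:
        mean = Fraction(sum(sample), n)
        total += sum((x - mean) ** 2 for x in sample) / divisor
    return total / len(samples)


# Hypothetical population: each value is equally likely on every draw.
population = [0, 1, 2, 3]
mu = Fraction(sum(population), len(population))
true_variance = sum((x - mu) ** 2 for x in population) / len(population)

n = 3
biased = expected_variance_estimate(population, n, divisor=n)
unbiased = expected_variance_estimate(population, n, divisor=n - 1)

print("true variance:      ", true_variance)
print("divide by n:        ", biased)
print("divide by n - 1:    ", unbiased)
```

Dividing by n comes out low by exactly the factor (n-1)/n, and dividing by n-1 removes that factor.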

2023-05-22 14:56:11 This applies to every other observation not just the first. As you can imagine, recomputing the average of the n-1 remaining observations for each observation is tedious. It's much easier to subtract the same sample mean each time and then account for the correlation afterwards. https://t.co/L0in8f4VrN

2023-05-22 14:56:09 IDEA: WE DON'T NEED TO ACTUALLY DECORRELATE. WE CAN JUST USE A CORRECTION FACTOR Subtracting the sample mean from the first observation is identical to subtracting the average of all the values excluding the first observation times an extra correlation factor. https://t.co/De4KVAgPeg
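The identity described in this tweet can be verified numerically. This is a minimal sketch with made-up data, assuming the extra factor is (n-1)/n, which is what the algebra gives when the observation is pulled out of the mean. The function names are hypothetical.

```python
from fractions import Fraction


def deviation_from_mean(xs, i):
    """x_i minus the mean of all n observations."""
    n = len(xs)
    return xs[i] - Fraction(sum(xs), n)


def deviation_from_others(xs, i):
    """(n-1)/n times (x_i minus the mean of the other n-1 observations)."""
    n = len(xs)
    others = xs[:i] + xs[i + 1:]
    mean_of_others = Fraction(sum(others), n - 1)
    return Fraction(n - 1, n) * (xs[i] - mean_of_others)


# Made-up observations; exact fractions avoid floating-point noise.
data = [2, 7, 1, 8, 2, 8]
for i in range(len(data)):
    assert deviation_from_mean(data, i) == deviation_from_others(data, i)
print("identity holds for every observation")
```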

2023-05-22 14:56:07 IDEA: DECORRELATING THE VALUES WITH ALGEBRA I will use the first observation as an example. STEP 1: We rearrange the terms so the mean no longer contains the first observation. STEP 2: We rearrange the remaining expression to involve the average of the remaining n-1 values https://t.co/71EFP2Mg9L

2023-05-22 14:56:05 The way I like to think about it is we're subtracting 1/n of each observation when we subtract the sample mean. Since we do this n times, once for each observation, and n times 1/n equals 1, we are effectively subtracting 1 observation. This is why we effectively have n-1 observations.

2023-05-22 14:56:04 IDEA: THE SAMPLE MEAN IS NOT INDEPENDENT OF OUR OBSERVATIONS Each observation and the sample mean are slightly correlated because the sample mean is computed using all the observations. https://t.co/3EW589pVJs

2023-05-22 14:56:02 INSIGHT 2: We can think of the sample variance as computing the average distance to the sample mean but with an extra correction factor. Our question then changes from "Why divide by n-1?" to "Where did the correction factor come from?" https://t.co/i6q5gG9Wxh

2023-05-22 14:56:00 Here are two key insights which will be important later. INSIGHT 1: Notice that in the formula for the sample variance, we are subtracting the sample mean from each observation. https://t.co/J9aUD6MgTo

2023-05-22 14:55:59 We should also quickly review the "sample mean" or "sample average". If you are comfortable with this concept, skip ahead to the next tweet. We compute the sample mean by adding up all our observations and then dividing by the total number of observations. https://t.co/Hls6v0Iyuh

2023-05-22 14:55:57 BACKGROUND This explanation is going to be confusing if you're rusty on summation notation. So here is a quick review. If you're comfortable with this concept, skip to the next tweet. Summation notation is a compact way of talking about adding up n values. https://t.co/394ErY21WB

2023-05-22 14:55:55 WHY do we divide by n-1 when computing the sample variance? I've never seen this way of explaining this concept anywhere else. Read on if you want a completely new way of looking at this. https://t.co/e4hjBYubRC

2023-05-21 08:56:16 RT @kareem_carr: philosophers: the only thing i know is that i know nothing data scientists: the only thing i know is that i know nothing…

2023-05-05 10:17:53 @MaartenvSmeden @IAmSamFin @AndrewLBeam Would it be more fair to say “methods” shouldn’t be dichotomized but “methodologies” can be? Meaning I could use K-means in a machine learning framework for your analysis (adhering to the scientific norms of the ML community) or in a statistical framework?

2023-05-05 10:00:52 RT @kareem_carr: This is the greatest mathematician story ever told https://t.co/lCoS68lEAk

2023-05-05 09:53:40 @MaartenvSmeden @IAmSamFin @AndrewLBeam I guess I'm a little skeptical the dichotomy is really "false". I could see a path to a synthesis in the specific context of medical research but that's because both subgroups would be subordinate to the research norms of the medical research community in that context.

2023-05-05 09:50:33 @MaartenvSmeden @IAmSamFin @AndrewLBeam I'm sympathetic to the overall goal of moving past dichotomization! But how would you respond to the fact that sociologically these are different groups: do different degrees, know different things, attend different conferences, adhere to different standards of rigor?

2023-05-04 23:43:06 @truthy_ty It is a very common experience among mathematicians to deeply want to talk about math in situations where they know they should not be talking about math.

2023-05-04 23:10:52 This is the greatest mathematician story ever told https://t.co/lCoS68lEAk

2023-05-04 18:29:25 @krichard1212 @kalle_leppala Thanks. I really appreciate this. I’ll DM you my email.

2023-05-04 17:53:31 @SciencePartisan @ent3c This is my confusion as well.

2023-05-04 17:40:20 @ent3c I think IQ is a valid construct as well but significantly more limited than the most enthusiastic proponents imply.

2023-05-04 16:17:51 @kalle_leppala @DialecticBio I don’t think so. No. But let me put it this way. if a guy calls a black person the n-word and someone’s reaction is “yeah. well, what was the n-word doing?” then that someone might not be a very neutral party. I’m definitely not going to waste my time trying to win them over.

2023-05-04 16:02:49 @DialecticBio @kalle_leppala I mean there’s racism and then there’s dropping the n-word. He’s basically saying it’s reasonable to call a black person the n-word if you feel like they are committing the super duper serious crime of being wrong on the internet.

2023-05-04 15:49:21 @krichard1212 @kalle_leppala These are some really interesting observations. Would love to pick your brain regarding references at some point.

2023-05-04 15:42:00 @kalle_leppala Documented proof that I have heard of the central limit theorem. https://t.co/rFNlw1WVva

2023-05-04 14:18:10 Yeah Sex is cool but have you ever asked a mathematician to show you how to prove something that he said was "trivial" and then watched him struggle at the board for twenty excruciatingly awkward minutes before finally giving up?

2023-05-04 13:36:46 RT @kareem_carr: I woke up to this. Whenever I tweet about IQ, no matter how technical my critique, I’m attacked for my race. People ass…

2023-05-03 15:12:43 I will conclude by saying I am sorry for putting this ugly image on your timeline but I think many people won’t believe the level of racism I have to deal with unless they see it for themselves.

2023-05-03 15:12:42 It’s a long story but I started school two years early which led to me constantly feeling awkward and out of step with everybody.

2023-05-03 15:12:41 I don’t talk about this part of my life much. I guess this meme sums up how I feel about it. Some of you are probably thinking “Burned out disappointment? Bro, you’re at Harvard.” I know. I know. I don’t have an answer for you. That’s just how I feel. https://t.co/pXucB7Mm8S

2023-05-03 15:12:40 These are my “state scholar” trophies. My country has two parts to high school. People typically graduate at about 16 and then again at 18. I was the best student in the country both times. https://t.co/eETdGxJq0o

2023-05-03 15:12:38 Back home we take these international standardized tests that are run through Cambridge University. I graduated high school at the top of my class with the highest exam results not just in my year but the highest grades that had ever been recorded for our country in decades. https://t.co/EKSGlkwbCJ

2023-05-03 15:12:37 The truth is, I’m from a small country of just 50,000 people, and chances are these racists have literally never met anybody like me. They presume to know me while lacking any frame of reference. Even within my own culture, I am a huge weirdo.

2023-05-03 15:12:36 In my darker moments, I fear that many will find these attacks plausible because it plays into pervasive stereotypes about black people.

2023-05-03 15:12:27 I woke up to this. Whenever I tweet about IQ, no matter how technical my critique, I’m attacked for my race. People assert without evidence that my IQ is low, that I’m an affirmative action candidate, that my credentials are fake, that I’m bad at math. I am called slurs. https://t.co/SKoweuG6IX

2023-05-03 14:17:23 @ent3c I will read your thread because I always want to know where I'm wrong. For instance, I'll update my future discussions of the normality assumption based on what you say here. But I feel misrepresented in this discussion so I won't respond to specific points raised at this time.

2023-05-03 14:12:01 @ent3c This feels unfair since my comments are related to what you tweeted yesterday not what you are tweeting today. I also think you are ignoring my main point about normality being an assumption which was my actual point in favor of nitpicking. https://t.co/cjTdNx7eUs

2023-05-03 13:44:31 @drubanov Who’s been saying it’s snobbish? Just curious. DM me if you don’t want to call anybody out.

2023-05-03 11:26:28 RT @kareem_carr: If you’ve ever wondered how mathematicians come up with such clever arguments, I strongly recommend “How to Prove It” It’…

2023-05-03 07:59:05 @love2laugh4ever Thanks!

2023-05-02 15:39:03 @dylanarmbruste3 Yes. I like to invest in hard copies for books that I feel are foundational and that I expect to reread in the future.

2023-05-02 15:28:46 @ent3c I don't see the relevance to anxiety scales. Do you think I think IQ is bullshit? Because that's not what I'm saying. I think IQ is an operationalization of particular group's understanding of intelligence which is fine but other groups might have a different understanding.

2023-05-02 15:25:16 @ent3c 2. "It's tautologically true that IQ is at minimum the ability to do well on IQ tests." Surely we can agree on that as a baseline? The debate is about whether it's more in addition to that.

2023-05-02 15:15:18 @ent3c I think we have a disconnect here. Let me try and frame it. I'm saying two things: 1. "Normality is an assumption." I think you think I'm saying it's an "unreasonable" assumption and so you're arguing it's a "reasonable" one. I simply want people to know it's an *assumption*.

2023-05-02 14:56:30 @ent3c Maybe I’m missing something. Happy to DM about it if you don’t want to be stuck going back and forth 280 characters at a time.

2023-05-02 14:54:05 @King_Mamadelo I think it’s extremely helpful for people in a math degree but pretty readable for a general audience as well.

2023-05-02 14:52:15 @ent3c But turning opinions into math shouldn’t let us off the hook from having to defend such consensus opinions.

2023-05-02 14:46:59 @ent3c Second, I think we agree that reasonable people might come to a consensus on what anxiety or aggression or intelligence means and then mathematize that as a measure. I have no beef with that.

2023-05-02 14:38:48 @ent3c Not sure why this is phrased as “pushback”. We agree that “the assumption of normality is statistical”. The key word here is “assumption”. People think the normality arises from the biology or the nature of intelligence which is wrong.

2023-05-02 14:32:04 If you like the thread then follow me for more content like this, and don't forget to click the little notification bell so you don't miss out on future threads. https://t.co/hLODGeQHWP

2023-05-02 14:32:03 At the end of this book, you will be able to understand statements like the one below and how to use them to design strategies for writing mathematical proofs. https://t.co/geBi0tSIYi

2023-05-02 14:32:01 It even covers what “and”, “or” and “if” mean in a mathematical context. (Not as straightforward as you might think.)

2023-05-02 14:32:00 If you’ve ever wondered how mathematicians come up with such clever arguments, I strongly recommend “How to Prove It” It’s an extremely gentle introduction that starts with the absolute basics and eventually teaches you how to construct a mathematical argument or “proof”. https://t.co/GqZxPGYAZN

2023-05-02 11:57:48 RT @kareem_carr: The perception of IQ as a seemingly objective measure of intelligence is frequently used to promote racist pseudoscience o…

2023-05-02 02:22:45 RT @ThosVarley: People (esp. blue checks) get *really weird* about IQ. Post even a mild critique how IQ is discussed and guys will come ou…

2023-05-01 23:34:08 @n00rdung by*

2023-05-01 23:27:12 @nianello6 https://t.co/Hvf12XnNUc

2023-05-01 23:26:03 @n00rdung That’s what many people mistakenly think. That’s why so many misunderstand the shape of the IQ distribution as somehow being natural. It’s not. It’s forced into that shape my mathematical manipulations. See? No CLT involved. https://t.co/2dUxrU3VFp
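One way a score distribution can be forced into a bell shape without any appeal to the central limit theorem is rank-based normalization: each raw score is replaced by the normal quantile of its percentile rank. The sketch below is a generic illustration under that assumption, not the scoring procedure of any particular IQ test. The raw scores are made up, the target mean of 100 and standard deviation of 15 follow the usual convention, and the function name is hypothetical.

```python
from statistics import NormalDist


def normalized_scores(raw_scores, mean=100.0, sd=15.0):
    """Map raw scores onto a normal scale using only their ranks.

    Ties are not treated specially in this sketch: tied values receive
    consecutive ranks in their original order.
    """
    n = len(raw_scores)
    order = sorted(range(n), key=lambda i: raw_scores[i])
    target = NormalDist(mu=mean, sigma=sd)
    scaled = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-rank percentile, always strictly between 0 and 1.
        percentile = (rank + 0.5) / n
        scaled[i] = target.inv_cdf(percentile)
    return scaled


# Made-up, heavily skewed raw scores.
raw = [1, 2, 3, 4, 50, 60, 400, 1000, 5000]
iq_like = normalized_scores(raw)
for r, s in zip(raw, iq_like):
    print(f"raw {r:>5} -> scaled {s:6.1f}")
```

However skewed the raw scores are, the scaled values are symmetric around 100 by construction, because only the ranks are used.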

2023-05-01 23:01:36 RT @TinaGower: This is a wonderful thread. When I’m giving the IQ I explain this to parents. IQ isn’t measuring intelligence. It’s a guess…

2023-05-01 22:22:42 @ScottAdamsSays Also correlation assumes a linear model and can be hugely inflated by just a few values if that's wrong.
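The sensitivity of a correlation coefficient to a few extreme values is easy to demonstrate. This is a minimal sketch with made-up numbers: the first eight points have exactly zero Pearson correlation, and adding one extreme point pushes the coefficient close to 1. The helper function is written out by hand for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data with no linear relationship at all.
xs = [1, 2, 3, 4, 1, 2, 3, 4]
ys = [1, 2, 2, 1, 2, 1, 1, 2]
r_without = pearson(xs, ys)

# The same data plus a single extreme point.
r_with = pearson(xs + [20], ys + [20])

print(f"without the extreme point: r = {r_without:.3f}")
print(f"with the extreme point:    r = {r_with:.3f}")
```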

2023-05-01 22:22:08 @ScottAdamsSays One of the main ideas is if an extremely low IQ score can detect severe medical dysfunction in a patient, and I don't dispute that it can, then the correlations can be strongly driven by that.

2023-05-01 22:17:24 @ScottAdamsSays I would recommend your followers read this: https://t.co/PfUwA2bj2N It requires a bit of statistical background unfortunately but I'm planning to break it down in a future thread if they want to follow me.

2023-05-01 21:44:33 @sbkaufman Hey, Scott. Yeah, we should set that up. I know we DMed in the past about doing that at some point. I’ll DM you!

2023-05-01 19:44:11 @jayjoseph22 Dang it. I should have asked you for this. I was trying to find it. Much better illustration.

2023-05-01 18:42:29 @datepsych Just to pick one point: “There is no major "IQ isn't real" debate in cognitive science.” You just kind of state that. Are we just supposed to accept that on your authority as an anonymous account on the internet? ¯\_(ツ)_/¯

2023-05-01 18:39:36 @datepsych No offense but this is kind of a gish gallop. I made arguments supporting my points and you’re just dumping a bunch of unsupported statements which obviously can’t be fully discussed without a huge rebuttal thread in the comments to my already long initial thread. https://t.co/oGRuRi86yP

2023-05-01 16:40:26 If you liked this thread follow me for more content like this. If you haven’t already, don't forget to click the little notification bell so you don't miss out on future threads. https://t.co/6kVWFTjGRQ

2023-05-01 16:40:24 This is because identifying students that *might* need help is a vastly easier task than constructing a completely objective and universal measure of human intelligence. Also "standardized test-taking ability" is vastly more relevant to school than it is to life in general.

2023-05-01 16:40:23 I personally think that IQ can be helpful when used for its original purpose of identifying underperforming students who might need some extra attention.

2023-05-01 16:40:22 In my opinion, I think we would all be better off if we thought of IQ not as "intelligence" but as a measure of "standardized test taking ability".

2023-05-01 16:40:21 This seems circular to me. If there was a new test of cognitive ability that largely didn't correlate with the others, they would exclude it or at the very least not weight it as strongly as the other tests.

2023-05-01 10:07:17 RT @kareem_carr: As a statistician, it is extremely frustrating to me to see an account called “World of Statistics” with over 1.5M followe…

2023-04-30 16:45:36 @quaesita Congrats Cassie! You're killing it.

2023-04-29 18:40:11 @FelixKreuk @KevnSPas Yes. That was exactly my thinking.

2023-04-29 17:25:27 @AnrothanN This guy? https://t.co/x3jjNxnRLo

2023-04-29 17:15:26 @LeetAlpaca3 What scientific question are you trying to answer and why would comparing the IQ of different countries be the best way to answer it? Countries are highly artificial constructs.

2023-04-29 16:51:31 @ranelagh75 They are using a white reference population as a baseline for Japanese. Using one population as a baseline for another is typically considered a misuse of the test and the interpretation of such results is highly disputed.

2023-04-29 16:36:30 @mdhunstiger Exactly. So many threads to pull on here. The precision to two decimal places is a huge red flag.

2023-04-29 16:11:37 Spreading these flawed data as fact is indirectly supporting white nationalism. If you want to follow an account on here that talks about statistics and data science regularly and isn’t going to be sharing any racist pseudoscience, follow me.

2023-04-29 16:11:36 The discussion of national IQ seems plausibly innocent and agenda-free but quickly leads to dark places. Unsurprisingly, Lynn has one book where he basically constructs a racial hierarchy of intelligence. https://t.co/BLIDn6CsnY

2023-04-29 16:11:34 The source of these national data is Richard Lynn who does not seem to be a neutral scientist. He’s been quoted saying things consistent with a white nationalist agenda. https://t.co/YZ5UGdET6G

2023-04-29 16:11:33 The European Human Behavior and Evolution Association formally recommends not using the data at all. https://t.co/KYJ2frfvKw

2023-04-29 16:11:32 Many, many scientists have questioned the methodology of the data collection. https://t.co/Mvuax2dNN2

2023-04-29 16:11:30 Even when it does use real data, it’s often in a sloppy and irresponsible way. For instance, basing the IQ of Equatorial Guinea on children from a home for the mentally disabled in Spain. https://t.co/iKyyVH4BNW

2023-04-29 16:11:29 The first thing we need to address is this extremely bad data. More than half of it is essentially made up. 105 of the 185 data points are guesstimates. https://t.co/hRrLQs147q

2023-04-29 16:11:27 The claim that the average Nigerian has an IQ of 67.8 is absolutely ridiculous. Such an IQ would imply severe cognitive impairment. And that is just their estimate of the average! Half of the distribution would be expected to have an IQ lower than that. https://t.co/N4cm9zMBSr

2023-04-29 16:11:25 This kind of pseudoscience exploits a deep cognitive bias that we humans have. We are willing to believe nonsensical “facts” about the human nature of out-groups that we would immediately see as nonsensical if it was said about our in-group.

2023-04-29 16:11:24 If you dig into these data even a little bit, it’s immediately obvious how nonsensical it all is. If you dig even deeper, what you find is bigotry and fraud.

2023-04-29 16:11:10 As a statistician, it is extremely frustrating to me to see an account called “World of Statistics” with over 1.5M followers spreading this pseudoscientific garbage. Statistics requires us to think critically about our data. This is *not* statistics. https://t.co/ANl8cVPgQT

2023-04-29 14:23:45 @LeeJenson1 It’s not 50-50. The chances are generally much smaller and not really independent either. But we know there are a very large number of gene variants that contribute only a little to height and they are probably only weakly dependent. There’s a version of the CLT that covers that.
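The point about many small, unequal contributions can be illustrated with an exact calculation. The sketch below makes simplifying assumptions that the tweet does not: 2000 hypothetical variants, each present with its own small probability between 0.05 and 0.14 and each adding one unit, all fully independent. It computes the exact distribution of the total and compares it with a normal curve that has the same mean and variance.

```python
from itertools import accumulate
from statistics import NormalDist


def sum_distribution(probs):
    """Exact PMF of a sum of independent 0/1 variables, one probability each."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)      # this variant absent
            nxt[k + 1] += mass * p        # this variant present
        pmf = nxt
    return pmf


# Hypothetical probabilities: unequal, and all far from 50-50.
probs = [0.05 + 0.01 * (i % 10) for i in range(2000)]
pmf = sum_distribution(probs)

mean = sum(probs)
variance = sum(p * (1 - p) for p in probs)
approx = NormalDist(mean, variance ** 0.5)

# Largest gap between the exact CDF and the normal CDF
# (with the usual half-unit continuity correction).
exact_cdf = list(accumulate(pmf))
max_gap = max(abs(c - approx.cdf(k + 0.5)) for k, c in enumerate(exact_cdf))
print(f"largest CDF gap: {max_gap:.4f}")
```

Weak dependence between variants, which the tweet mentions, is not modeled here.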

2023-04-29 02:52:31 @DrShariEllen I know there are some on amazon like this one: https://t.co/yX71nHk74B

2023-04-28 14:22:50 @skdh I think GPT-4 can already pass it quite easily with the right adjustments.

2023-04-28 14:02:08 do you follow the creed? https://t.co/xEpokD1UQ3

2023-04-28 09:25:39 RT @kareem_carr: I’m in the good place https://t.co/p1TK2FDpde

2023-04-28 02:22:47 @HereContrarian @ETVPod @matthematician We are going to have to agree to disagree. Take it easy.

2023-04-27 23:39:21 @HereContrarian @ETVPod @matthematician When people are accusing you of being an operative for the Chinese government and of trying to start a race war, I think it’s absolutely fair to shift the focus of the discussion from the philosophy of math to the “bad ideological actors” participating in the conversation.

2023-04-27 22:50:50 @ryxcommar kareemcarr[.]bsky[.]social

2023-04-27 22:06:17 @tjmahr Good to know. Thanks!

2023-04-27 21:25:02 @tjmahr Whoa!!! is this a base R command? If so, what version?

2023-04-27 17:04:03 I feel like the main lesson statisticians should be taking from machine learning is being willing to spend tens of millions of dollars on a single model will get you some damn good statistical models.

2023-04-27 16:25:59 Sorry. Still a newbie. No invites yet.

2023-04-27 16:25:58 I’m in the good place https://t.co/p1TK2FDpde

2023-04-27 15:45:00 I'm becoming increasingly concerned about the number of people who think "talking like a person" is equivalent to "being a person". I can't tell if this means they have a really low opinion of humans or really high opinion of algorithms.

2023-04-27 15:12:23 Linear regression is a foundational human discovery. The ability to fit a line to a set of data points is an essential tool in the toolbox of every scientist and engineer.
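Fitting a line to data points takes only a few lines of code. This is a minimal sketch of ordinary least squares for one predictor, using made-up points that lie close to y = 2x + 1. The function name is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Made-up observations scattered around y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```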

2023-04-27 09:20:54 @TheOutsiderHum1 @matthematician @ETVPod My attitude was like "here's another way to think about it". I definitely didn't harass anybody or question their intelligence. A lot of the responses to me were like "you are a bad person and the stuff you're saying is a danger to Western civilization". It was unhinged.

2023-04-27 08:26:27 RT @kareem_carr: I’m choosing violence today. These lists are biased towards physics. The linear regression equation should at least be 4th…

2023-04-26 23:54:41 RT @kareem_carr: We now have data demonstrating my tips for improving prompts to GPT-4 work at least in this one case! Telling GPT-4 it…

2023-04-26 23:40:24 @ETVPod @matthematician Defensively phrasing everything you say in an attempt to win an ideological war with the right doesn’t sound great to me. Assuming one even wants to participate in that struggle which many don’t. Besides the most partisan rightwingers will happily just invent “facts” if needed.

2023-04-26 20:48:34 @miclugo @LadySynaptic On the surface, this is fair but for me personally, the *convergence* to the Normal is what feels salient and most beautiful to me.

2023-04-26 16:03:31 @LadySynaptic That would be awesome. My hope is I can find a way to support myself financially while having lots of time to do cool stuff like that.

2023-04-26 16:00:31 @miclugo @LadySynaptic I feel like the central limit theorem has to be on there somewhere. It's so fundamental that it's almost a law of nature. Human height is approximately Gaussian for instance. (I don't know. I'm probably biased.)

2023-04-26 15:50:43 @nolightupstairs @LadySynaptic that doesn't seem right. fake news.

2023-04-26 15:48:44 @LadySynaptic In my defense, I was blinded by rage.

2023-04-26 15:42:25 Normal distribution is OK but the people demand more!!!

2023-04-26 15:42:24 I’m choosing violence today. These lists are biased towards physics. The linear regression equation should at least be 4th or 5th in terms of impact! https://t.co/hs3UN1sVTf

2023-04-26 09:11:56 @ent3c @kph3k Thanks.

2023-04-25 22:15:57 @ent3c Of course, any money going to that would not be going to education thus confounding. Seems like a planet-sized flaw to me.

2023-04-25 22:15:42 @ent3c I’m sure one could easily list a hundred likely heritable ailments that would limit educational attainment mainly due to accessibility issues (cancers, metabolic diseases, sickle cell, etc, etc).

2023-04-25 21:50:58 @sama It’s been a real uphill battle to convince other academics to use ChatGPT because of the hallucinations. I’ve been trying to figure out good prompt engineering strategies for academics. Would really appreciate access to the GPT-4 API to help my efforts along. https://t.co/gznQLReUis

2023-04-25 21:43:03 @ent3c So…do they have a way of addressing this in these analyses? It would obviously lead to an overestimate of the “heritability” that we actually care about.

2023-04-25 20:39:47 @ent3c The genetic variants associated with the disease affect my behavior but only in the sense they physically limit what I can do with my body. (You can imagine a variant of this where skin color affects educational access so of course skin color genes would be associated with EA).

2023-04-25 20:33:46 @ent3c Imagine I have a very serious heritable disease that doesn't affect my cognition but has huge effects on my ability to function in society. This would obviously be a genetic determinant of my educational attainment but in the most boring way. How do they account for this?

2023-04-25 16:23:58 RT @kareem_carr: ChatGPT consistently gets this very basic question wrong. Does that mean ChatGPT is useless? Not necessarily! Using this…

2023-04-25 14:31:28 @ResearchChat @Teknofiliac This is really interesting. I tried out your prompt and it seems like "give me x examples" where x is 2 or more is the phrase that is causing it to give the right answer. Fascinating. I don't have a theory for why this works (yet).

2023-04-25 08:49:01 @jasonaholliday Here’s how I think about it: https://t.co/IKF22MP4vs

2023-04-25 08:47:55 @chunderboolt My interest is in using it as a tool. Less interested in the philosophical questions of whether it’s truly “intelligent”. I just see it as fancy math combined with lots of data.

2023-04-25 08:42:12 @emaedi0ng I didn’t make the plot but it’s R. The ggplot2 package.

2023-04-25 01:19:49 @Teknofiliac Wow. That's the most successful attempt I've ever seen. What do you think is going on here?

2023-04-24 23:08:01 Thanks to @colin_fraser for collecting the data and to @teej_m for providing GPT-4 access. Colin and I don't have access. Help us out @OpenAI and @sama . (Sample sizes in each case were 100.)

2023-04-24 23:08:00 We now have data demonstrating my tips for improving prompts to GPT-4 work at least in this one case! Telling GPT-4 it was more competent increased the success rate from 35% to 92% Giving GPT-4 a strategy for completing the task increased it from 26% to 54% https://t.co/fqyNO6S9oq https://t.co/AU9fkBYeA4
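A rough sense of the uncertainty in these success rates can be had from a simple interval for the difference between two proportions. The sketch below assumes 100 independent trials per condition, the sample size given in the thread, and uses the normal-approximation (Wald) interval, which is only one of several possible choices. The function name is hypothetical.

```python
from math import sqrt


def difference_interval(p_before, p_after, n_before, n_after, z=1.96):
    """Approximate 95% interval for p_after - p_before (Wald method)."""
    diff = p_after - p_before
    se = sqrt(p_before * (1 - p_before) / n_before
              + p_after * (1 - p_after) / n_after)
    return diff - z * se, diff + z * se


# Success rates reported in the tweet, assuming 100 trials per condition.
competence = difference_interval(0.35, 0.92, 100, 100)
strategy = difference_interval(0.26, 0.54, 100, 100)

print("competence prompt: %.3f to %.3f" % competence)
print("strategy prompt:   %.3f to %.3f" % strategy)
```

Under these assumptions both intervals sit well above zero, so neither improvement looks like sampling noise.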

2023-04-24 23:00:07 @colin_fraser @teej_m Thanks!

2023-04-24 22:59:18 @dolohov Telling it to write out the solution to the problem step by step is an easier word prediction task. The logic of each step is easier to predict versus doing it all in one shot. Also tasks that resemble the simple steps are more likely to be in the training data somewhere.

2023-04-24 22:54:58 @dolohov It learns the probability of the next word given the previous words. Telling GPT-4 that it's an expert, influences it to model words from experts which are more likely to be right vs the default which is words from a generic human.
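The idea of a probability for the next word given the previous words can be shown with a toy model. The sketch below is a deliberately tiny stand-in and not how GPT-4 works internally: it conditions on only one previous word and estimates probabilities by counting word pairs in a made-up sentence. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Estimated probability of each next word, given the previous word."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}


# Made-up training text.
corpus = "the mean is the sum divided by n and the mean is central"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
print(probs)
```

Changing the text the counts come from changes the predicted next words, which is a very loose analogue of the tweet's point about steering the model toward expert-like text.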

2023-04-24 22:30:28 @colin_fraser @teej_m What are the numbers in the plot for GPT-4? I can sort of eyeball it but would be good to get precise numbers.

2023-04-24 22:18:28 @colin_fraser @teej_m Quick question. What are the sample sizes?

2023-04-24 22:17:23 @LucyStats @colin_fraser Lucy is amazing. If she were offering me free help, I'd take it.

2023-04-24 22:16:05 @colin_fraser @teej_m Awesome. At least, I'm not completely crazy. This makes me even more frustrated that I don't have direct access to the GPT-4 API. I could do so much more! lol.

2023-04-24 21:16:26 @colin_fraser Looking at your code, it seems like maybe this stuff doesn’t work for GPT-3.5 which seems plausible to me. It’s also possible that asking for the answer in a specific format ended up competing with the other instructions. Thanks for your work. This is helpful.

2023-04-24 21:08:24 @colin_fraser Thanks. This could be helpful with trying out new ideas in the future.

2023-04-24 21:04:53 @gabrielbodeen I just wanted to show some approaches to improving prompts.

2023-04-24 21:01:59 @ksuhre Not sure but I’m currently reading this book by @stephen_wolfram. https://t.co/tuUoc3svPP

2023-04-24 20:47:09 @michelnivard I just mean it’s random. So it could just get it right by luck but mostly get it wrong in general. That’s why I repeated my queries in a new window.

2023-04-24 20:43:42 @colin_fraser I’m surprised but if what I’m saying is wrong, I’d like to know since I’m wasting my time otherwise. Can you share some details of how you parse the final answer since my approach does cause it to list multiple solutions sometimes? (Would love access to your code. DMs open)

2023-04-24 20:34:39 @sarah_grabinski I didn’t but that makes sense as a modification to my instructions.

2023-04-24 20:32:56 @michelnivard I’m not thinking you made this up at all. I wouldn’t expect my modifications to be 100% effective. Just curious because for me it’s pretty much always “objective” suggesting it happens with high probability.

2023-04-24 17:15:05 RT @kareem_carr: ChatGPT consistently gets this very basic question wrong. Does that mean ChatGPT is useless? Not necessarily! Using this…

2023-04-24 17:11:51 @michelnivard Did you try it more than once in different sessions? I literally have never had it answer correctly without extra guidance.

2023-04-24 16:05:35 @wtgowers I guess coal is a goal for some people. https://t.co/OXZqLVj45J

2023-04-24 15:21:58 @J_JPerezCano Thanks. This is a new term for me. https://t.co/ChnCbVKdwh

2023-04-24 15:10:17 It picked "culmination" which seems reasonable so analysis complete!!! https://t.co/UPohOXIRx4

2023-04-24 15:10:16 TIP 4: USE PRIOR ANALYSES to enhance your final conclusion. In this case, I found the options: - checkpoint - challenge - conclusion - criterion - culmination Which should I pick?

2023-04-24 15:10:15 In 4 out of the 5 cases it just spit out the answer. In one of the cases, it shared its "thought" process. (It's a machine. It can't really think.) You can see, it's able to follow the simple algorithm and get the right answer. https://t.co/htaGyWQCzZ

2023-04-24 15:10:14 Combining tips 1 and 2, I was able to get GPT-4 to generate a reasonable answer 5 out of 5 times!

2023-04-24 15:10:13 ChatGPT consistently gets this very basic question wrong. Does that mean ChatGPT is useless? Not necessarily! Using this as an example, let me show you how to take your prompts to the next level. https://t.co/7ImFmdG7np

2023-04-24 12:23:31 @philoso_foster I have no idea what people in that job market are looking for but that looks objectively awesome.

2023-04-24 00:06:00 Every day on here, I see people asking technical questions that ChatGPT would absolutely kill if given the right prompt. It makes me realize that not everybody is up to speed on what these things can do.

2023-04-23 23:30:27 @ben_golub @adamdangelo I thought what he was saying had a very reasonable interpretation as well. We could basically have more nuanced laws that better cover unique edge cases without suffering any extra cognitive burden. Thanks for writing this up. I wanted to wade in but I try to pick my battles.

2023-04-23 22:39:40 @pwang It seems like R doesn’t suffer from these issues to anywhere near the same extent. Yet it also heavily utilizes C/C++ code on the backend. Why do you think that is?

2023-04-23 18:40:23 @fchollet @pfau I've noticed this issue manifesting in various forms for years. ML terminology tends to promote anthropomorphization: "training", "learning", "neural", "hallucination", etc. This is made worse by the blackbox nature of the algorithms and the field's cultural obsession with AGI.

2023-04-23 16:46:14 My work is a nice intersection between engineering type math (Fourier analysis for analyzing periodic signals) and statistics stuff (estimation theory) and also random biology stuff as needed. Some background reading for the curious: https://t.co/tF1aopd5iP

2023-04-23 16:46:13 Reading genetic clocks with math. https://t.co/NeHyeVWKS2

2023-04-22 22:23:59 @mpeg2tom This is exactly the kind of thing I had in mind!

2023-04-22 22:18:41 @NeuroStats @EpiEllie @LucyStats It would just invent one whether it had used that one or not!

2023-04-22 17:01:38 This focus on high paying industries seems to be true whether their parents are rich or not. [source: https://t.co/ykEp2WOMH5 ] https://t.co/vAQLAv8BVx

2023-04-22 17:01:36 One of the most defining features of Harvard students in my experience is not that they are smart or that they are overly invested in convincing other people that they are smart. It’s that they are laser-focused on getting a stable job that pays a crap load of money. https://t.co/SwjlAeJGZf

2023-04-22 16:44:23 I kept them. At zero dollars, they were kind of a steal! https://t.co/6OAc17xvx1

2023-04-22 16:30:37 @loganb I hate myself that I immediately knew this was Star Trek.

2023-04-22 16:27:18 @GMCarlier I refund a lot of things so I have a good feel for it. I think these are just unpopular items.

2023-04-22 16:26:29 @faeriella67 I assume that's true but I return books that cost a lot less all the time. I think it must be a combination of the price, the weight and the probability of re-selling it.

2023-04-22 16:23:59 They weren't that cheap either. One was $47 and the other was $53 USD.

2023-04-22 16:05:13 I recently ordered two different econometrics books off Amazon and then ended up deciding to return them. In *both* cases, Amazon has given me a full refund but told me not to send the book back. It's like "Yeah. We totally get it. Just throw it in the garbage if you want."

2023-04-22 15:27:26 @ben11kehoe Thanks*

2023-04-22 15:14:59 @LucyStats I think if it’s not a blackbox algorithm then it’s probably fine. I could also imagine using some kind of bootstrap approach to estimate uncertainty. I think mostly I’m worried that most people aren’t going to do any of that and will implement this in the stupidest way possible.

2023-04-22 15:08:35 @ben11kehoe That’s hilarious! Thanks for sharing.

2023-04-22 15:07:07 @smilicic For older professors, that’s been my experience as well. I have read my share of ugly C and Fortran code written by mathematicians. I feel like things are a lot better these days with the rise of Python and more education on coding best practices.

2023-04-22 14:11:37 Controversial opinion: naming mathematical concepts after specific people is a bad practice. It just adds to the cognitive burden of an already extremely cognitively demanding subject. We should be using more informative names.

2023-04-22 14:10:13 @PrasoonPratham I really like this for you my friend! Glad you're doing well.

2023-04-22 13:42:05 “Should we be using AI to fill in missing data?” If the AI is a blackbox then ABSOLUTELY NOT! Modifying your raw data is a serious thing. You want to do it via a controlled and well-understood process. AIs like ChatGPT are currently none of those things. https://t.co/Yidydan1C9

2023-04-22 11:21:55 @joftius This is awesome. Thanks for sharing.

2023-04-22 11:20:06 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-22 02:29:42 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-21 19:40:33 @willkurt That sounds really cool. Looking forward to the write up.

2023-04-21 19:33:30 @ChelseaParlett @GoogleMagenta Your family sounds really special. May her memory be a blessing.

2023-04-21 09:19:12 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-20 14:03:46 @ScottDMurdock1 Awesome!

2023-04-20 14:03:34 @andreamatranga Great idea. The right kind of springs would probably do it!

2023-04-20 12:24:35 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-20 10:54:37 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-20 03:11:09 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-20 02:41:16 @xDannyVdHaven there’s a total least squares package in r: https://t.co/D5pDV6xOH7

2023-04-19 22:02:06 RT @kareem_carr: This visualization differs from linear regression in two very important ways.

2023-04-19 21:50:47 @prateekpatel_in https://t.co/ZP5Ya6Hnrd

2023-04-19 21:49:44 One might suspect that the 1st PCA is the general solution even when there are more than two variables. This is not the case but the solution does involve PCA. See below and check this link for more details: https://t.co/LvjMeEQqIE https://t.co/MpHwnFJ6xh

2023-04-19 19:28:41 @ClimateOfGavin It's not traditionally what is meant by "linear regression". But I agree it's a "linear model" and also a type of "regression". So calling it "linear regression" would theoretically make sense but it would also confuse people and lead to lots of miscommunication.

2023-04-19 19:24:15 @MusingsOfDeimos Awesome!

2023-04-19 19:23:57 @EpiLeyla awww. been really busy lately. will be back to joking around soon lol.

2023-04-19 19:20:04 @Artui_uf Almost. I don't think the physical system in the video is minimizing the sum of squares orthogonal distance.

2023-04-19 19:18:31 @cfsaracho Not a correction. Just a bit of extra info addressing some of the commonly asked questions.

2023-04-19 16:39:44 A few notes based on the comments: - Total least squares regression is the general problem of minimizing the sum of the squared perpendicular distances between your model and a set of points - The 1st principal component is the line that does that for two variables https://t.co/USp0YgaUlV
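
The distinction in these notes can be sketched for the two-variable case. This is a minimal sketch using only the standard library: the function name `fit_slopes` and the sample points are made up for illustration, and the total least squares slope is computed from the closed-form expression for the first principal component of the 2x2 scatter matrix, which assumes the cross term `sxy` is nonzero.

```python
import math

def fit_slopes(xs, ys):
    """Return (ols_slope, tls_slope) for a line fitted to 2-D points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Ordinary least squares: minimizes squared vertical distances.
    ols = sxy / sxx
    # Total least squares: minimizes squared perpendicular distances.
    # This is the slope of the first principal component (needs sxy != 0).
    tls = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return ols, tls

# Made-up illustrative points, not data from the thread.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.5, 0.4, 2.6, 2.5, 4.5]
ols_slope, tls_slope = fit_slopes(xs, ys)
print(f"OLS slope: {ols_slope:.3f}, TLS slope: {tls_slope:.3f}")
```

For positively correlated points the perpendicular-distance slope is at least as steep as the vertical-distance slope, and the two coincide only when the points lie exactly on a line.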

2023-04-19 16:26:59 @smilicic Linear regression solves the problem we typically care about which is the error in our *predictions* of some unknown y given some input variables. The process in the video minimizes error in all variables. It's a good visualization of how to solve a different problem.

2023-04-19 15:38:49 @zero132132 The process in the video is fitting a linear model. All things being equal, it ought to be fair to say it's a kind of "linear regression" but for historical reasons, it's not what we mean by linear regression. Relevant terms: total least squares and orthogonal distance regression

2023-04-19 15:08:24 1. Linear regression optimizes the vertical distances not the perpendicular distances. https://t.co/kBS4I3wriL

2023-04-19 15:08:23 This visualization differs from linear regression in two very important ways. https://t.co/86dJWWui6a

2023-04-19 14:08:49 *before chatgpt* me to journalists: hope you know how to code LMAO!!! *after chatgpt* me to journalists: hope you know how to write crisp, clear text that expresses the intricate nuances of a complicated problem LMAO!!!...oh wait...

2023-04-19 13:54:02 @RidleyDM Yeah. Cryptocurrency-related computations are another kind of computation that seem to get special scrutiny in addition to AI.

2023-04-19 13:52:18 @agostino_harry Alas. In this case, I'm talking about the greenhouse emissions. It seems to be a very big chunk of the internet.

2023-04-19 13:48:04 I get the impression that many people feel like it's a bit irresponsible to discuss the benefits of AI without also discussing the environmental costs. It just seems like people don't apply that same standard to computer science in general and that feels odd to me.

2023-04-19 13:36:00 I don't understand the concern about the environmental impact of AI because AI doesn't strike me as a particularly frivolous use of computation. Huge amounts of energy get wasted on digital porn and video games but I haven't seen too many people worry about that.

2023-04-19 10:35:21 @karlrohe Not yet unfortunately.

2023-04-19 01:20:24 @PhDemetri No joke. This is happening. Lol.

2023-04-18 23:52:14 I’m cracking myself up. https://t.co/1NCNLbqMlE

2023-04-18 23:24:32 @gusthema I have zero right now as a newbie. hopefully that changes soon.

2023-04-18 23:15:23 I have 6 followers lol. it's been like two minutes. already liking this.

2023-04-18 23:13:55 I just joined blue sky!!! Username is kareemcarr.

2023-04-18 22:47:59 @AstroAlysa Yup. Not much place for nuance these days. But if we let them bully us into not freely expressing our thoughts then they win.

2023-04-18 22:30:54 I have the fancy Hagoromo chalk. It's not like I don't know better. I'm just objectively a bad person.

2023-04-18 22:25:45 I own a really nice blackboard. Vintage wood frame with metal accents at the corners. Absolutely beautiful. I'm planning to paint over the surface and turn it into a whiteboard. I have become ungovernable.

2023-04-18 22:15:25 A lot of folks find my use of Twitter polls confusing. "You're a statistician. Don't you know they're biased?" "Isn't the right answer obvious?" These polls help me understand the starting assumptions of my audience and this helps me write better.

2023-04-18 16:17:43 @Bbburner19 I don't understand. How can it feel without a body? When I think of emotions, I think about feeling butterflies in my stomach or my heart pounding in my chest. I don't know how to think about emotion outside of those kinds of body-based experiences.

2023-04-18 16:09:11 @jasserole Free country I guess but you have no idea what I'm measuring. I can't really discuss it either because that would just bias the poll.

2023-04-18 16:07:36 @lourencoserpa No. It says that I can put myself in the shoes of others even if I don't necessarily agree with them. (I'm not from the US by the way.)

2023-04-18 16:02:22 @jasserole A poll is a measurement tool not an opinion.

2023-04-18 15:56:54 @lourencoserpa The point of asking the question is to learn what other people think. I already know what I think.

2023-04-18 15:44:14 When it comes to AI sentience, what I really care about is the moral status of AI relative to humans. If humans invent a sentient AI that is about 10 times as intelligent as the average human, how would you compare the life (continued existence) of that AI to a human?

2023-04-18 15:29:05 Is your educational background in statistics? Do you think if we keep scaling up algorithms like GPT-4 that we could create an AI that is sentient?

2023-04-18 15:22:19 @XL5Peinado I think it comes down to funding. Without money to support public communication efforts, scientists have to prioritize their lives, careers and families.

2023-04-18 15:10:30 @skdh True. It shouldn't be the majority of the field. That would be chaos. Never fear. I've always been with you in spirit on the state of certain subfields of theoretical physics.

2023-04-18 15:02:55 As I listened to this clip, I just kinda felt myself slowly give up on a deep internal level. No amount of tweeting is going to overcome this level of AI hype lol. https://t.co/5gfHmgZRth

2023-04-17 17:54:51 @vhoseam I typically ignore individual racists but it's not realistic to ignore racism in general as a black person. I don't care about these particular accounts. I already blocked them. I'm just using them as an example of a general problem that I think is worth discussing.

2023-04-17 17:16:20 I think of it as being able to project my high-dimensional understanding of an idea on to the closest approximating low-dimensional manifold which lies within the understanding of the audience. This task requires a deep, almost fractal, level of understanding.

2023-04-17 17:16:19 As an "expert", I don't feel like I truly understand anything unless I can explain it in a way that a layperson can understand.

2023-04-17 17:04:41 @LigerzeroGaming Good to know. Thanks!

2023-04-17 16:43:19 @refudiatem I read it in undergrad. Loved it. It was my first introduction to principal components!

2023-04-17 16:41:10 @SouThernDreAmz4 I don't agree with your summary but putting that aside for now. Treating me (a black person) negatively because of resentments toward black people in general is literally racism. You do realize that right?

2023-04-17 16:23:12 It’s kind of sad when people are so full of racist brain worms that they can’t even have a rational conversation with a black person about math and science without bringing up that person’s race and inventing all kinds of weird, racist conspiracy theories about their motivations. https://t.co/jKRDlusGEs

2023-04-17 10:41:34 @MoSan91 Science isn't just data. It's also analysis, and data isn't just a bunch of numbers reported with minimal context.

2023-04-17 10:12:09 @lastpositivist In terms of policy, it doesn't seem that relevant to me whether differences are innate. What does seem much more relevant is that substantial differences might continue to persist given a cost-prohibitive level of intervention. Innate differences would only be a subset of those.

2023-04-17 09:17:47 @Iamnoturenemy I would give it a pass on rigor if people weren’t quote-tweeting it like it was science. I don’t mean to beat up on this one person though. It’s just an example.

2023-04-17 09:15:13 @jonatanpallesen Journals do peer-review. Articles have methods sections where the method is described. They often have supplementary documents with implementation details. What is the equivalent here? I stand by my critique that this image represents a culture of pseudoscience on Twitter.

2023-04-17 00:13:51 @Nullsci1 @cremieuxrecueil I’m asking for the details of the analysis. This is just basic peer-review on which modern science is built. It is not trivial.

2023-04-17 00:05:08 @cremieuxrecueil @Nullsci1 Do you have a link to your code/analysis? How can we say anything about your analysis if we don’t know what you did?

2023-04-17 00:02:52 Once it becomes clear that anybody can fake anything using AI, there will be a demand for trusted social networks where credibility is safeguarded, and identities and credentials are verified.

2023-04-16 23:31:44 RT @kareem_carr: Statistician here. This is a kind of statistical pseudoscience which is extremely common on social media.

2023-04-16 21:06:44 If you’re going to talk to ChatGPT about your personal problems, I would go with GPT-4. GPT-3.5 is a dumbass. Verbose. Superficial. Simply the worst. https://t.co/erJIMvMJTR

2023-04-16 21:01:15 @lastpositivist You (@lastpositivist) might know better than me but it seems to me that the problem requires either direct access to the real world so the AI can test statements directly, or near perfect inferential heuristics (much better than what we humans could do).

2023-04-16 20:57:39 @lastpositivist I think it’s inherent to the estimation problem, not just the technology. How could you determine purely from text (much of which is false), what statements are factual? Surely “truthiness” is not a property that lies within the text itself.

2023-04-16 20:44:05 @rappa753 It’s ok to use diagrams to communicate a concept but when they’re being used as scientific evidence, they need to be rigorous.

2023-04-16 20:41:21 @dborcic Yes. I think so.

2023-04-16 20:39:56 @SarahAvraham1 Yes. Presenting decontextualized conclusions wrapped in pretty visualizations as evidence of a claim is pseudoscientific. It evokes the trappings of science while avoiding the scrutiny it requires. It’s not irrelevant to bring up what these visualizations are being used for.

2023-04-16 14:42:51 Many people see statistics as a tool for identifying natural laws of human differences. Usually the differences between men and women or between races. People like this do not really care about statistics itself so they tend to never really understand it.

2023-04-16 14:42:50 This is barely statistics in my eyes. There is no discussion of the uncertainty in the claims made. There is no evaluation of the robustness of conclusions to changes in assumptions. All I'm seeing here is a bunch of unsubstantiated lines on a histogram.

2023-04-16 14:42:49 On top of that, looking at the plot makes it seem like you can "see" the underlying data. This makes it seem transparent and thus more credible. But *can* you see data? Why didn't the creator make the raw datasets and code publicly available for critique?

2023-04-16 14:42:48 Statistician here. This is a kind of statistical pseudoscience which is extremely common on social media. https://t.co/2OM24VGBLL

2023-04-15 17:42:28 @brodavvg On a serious note, it’s not a terrible list. If you put a few weeks into each item, I’m sure you could get a job. It’s just funny how learning statistics, which is a massive field that’s more than 100 years old, is put on the same level as learning Tableau or Excel.

2023-04-13 00:13:21 It’s behind a paywall but the premise sounds interesting. (Nobody’s asked me to drop out yet…)

2023-04-13 00:13:20 An unexpected side effect of the AI gold rush: “Most Stanford Ph.D.s have some startup or company that’s trying to get them to drop out” https://t.co/IL35NPWF21

2023-04-12 23:49:03 RT @kareem_carr: the perfect t-shirt does not exi— https://t.co/GtRriCUT1o

2023-04-12 22:41:10 Many view data science as just another career. To me, it's a specific approach to problem-solving. It's a collection of techniques for extending the scientific method to everyday questions.

2023-04-12 15:54:36 @ElJay314159 We can argue over what word to use (since most people are not going to know what a hidden state is) but yes I mean that you can guide it to the right hidden state.

2023-04-12 15:16:18 The number one mistake I see people making with LLMs is they think if an LLM doesn't know how to do something by default then that's the end of the story. They don't realize LLMs are teachable.

2023-04-12 14:01:03 the perfect t-shirt does not exi— https://t.co/GtRriCUT1o

2023-04-08 11:30:45 RT @kareem_carr: Time is short! Twitter has started blocking links to S*bst*ck. This is terrible timing for me since I was planning to star…

2023-04-08 03:27:25 @LKaboolian I DMed you the link. Let me know if it works for you.

2023-04-08 02:06:13 You might see this https://t.co/ZOzybFhGeb

2023-04-08 02:05:30 @lora_not https://t.co/ZOzybFhGeb

2023-04-07 15:39:48 I’m using a bit ly link in the last tweet as a work around. Try liking or commenting on this tweet and you’ll see what I mean: https://t.co/q3dBFuDjar

2023-04-07 15:39:47 Time is short! Twitter has started blocking links to S*bst*ck. This is terrible timing for me since I was planning to start one this year. Join up now while you still can so you can get updates. https://t.co/v3Mu57bNTd

2023-04-05 23:20:49 @GidMK @GaryMarcus @ylecun I agree.

2023-04-05 23:15:04 @GidMK @GaryMarcus @ylecun I get that LLMs are often wrong but empirically how much physical harm has resulted from these mistakes? It’s almost certainly negligible at this point. The more we can remind people of the error rates the more likely it is that physical harms will remain low.

2023-04-05 23:11:18 @GaryMarcus @GidMK @ylecun The rate is higher but the overall impact is lower. A terrorist attack is more lethal but you’re more likely to die from heart disease. Probably orders of magnitude easier right now to find somebody who was hurt by following guidance from a newspaper than an LLM.

2023-04-05 23:04:36 @fakr00n I mean cars as they are now since that’s what we have on the roads right now. I think we’d literally never allow near universal access to these huge, lethal machines.

2023-04-05 22:59:52 @DigitalWatches Maybe it’s good that we wouldn’t but it seems very obvious to me that we wouldn’t. We’d be much too scared of handing that kind of power to basically anybody.

2023-04-05 22:53:39 @GidMK @GaryMarcus @ylecun Every medium of speech is a source of false statements. I think it would be hard to prove that LLMs are a larger source of false information than books, websites, magazines, television or other humans. This may change of course.

2023-04-05 22:42:16 Imagine we’d just invented cars. Do you think we could make them widely available in this political climate? Four thousand pound metal machines that can hurtle along at 100 miles an hour accessible to any adult human that wanted one? We would never.

2023-04-05 22:30:43 @GaryMarcus @ylecun This seems like a strong claim but maybe the harms are smaller than I’m thinking. What kind of harms do you have in mind?

2023-04-04 17:32:48 @ZachariahNKM @OpenAI @sama I'm thinking forward to what scientists would need in order to incorporate GPT-4 into data analysis workflows. If it's not reproducible that severely limits its scientific applications.

2023-04-04 17:30:54 @OfficialLoganK @OpenAI @sama Lack of some equivalent to random seeds isn't a deal breaker but it means using more computation to average out the randomness. Is deprecation a choice or a necessity?

2023-04-04 16:27:40 @arthurzqx @OpenAI @sama What does "in the air" mean? Like deployable? I'm not a backend person in any way whatsoever.

2023-04-04 16:25:37 The tendency for large language models to hallucinate is a huge obstacle to using them for scientific analyses but I don't think it's insurmountable. We can apply statistical tools from survey science to extract the signal from the noise.
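
One very simple version of that idea is to treat repeated responses to the same question like survey respondents and report the most common answer along with how often it appeared. This is a minimal sketch: the tweet does not say which survey tools the author has in mind, the function name `aggregate_answers` is hypothetical, and the responses are placeholder labels rather than real model output.

```python
from collections import Counter

def aggregate_answers(answers):
    """Return the most common answer and the share of responses that gave it."""
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)

# Hypothetical answers from asking the same question in five fresh sessions.
responses = ["A", "B", "A", "A", "C"]
answer, agreement = aggregate_answers(responses)
print(answer, agreement)
```

The agreement share gives a crude sense of how stable the answer is across sessions; a low share suggests the individual responses should not be trusted on their own.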

2023-04-04 11:39:33 @pfau @OpenAI @sama Demonstrating scientific utility could go a long way to building credibility and good PR.

2023-04-03 22:48:33 I don't know how interested @OpenAI and @sama are in making GPT-4 into a research tool, but I think the first thing we'd need to do is establish reproducibility. To this end, we need: 1. build numbers (so we know the version) 2. random seeds (so we can reproduce random behavior)
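
The role of a random seed can be illustrated with a toy example. This is a minimal sketch using Python's standard `random` module as a stand-in for stochastic text generation; it does not call any OpenAI API, and the function name `sample_tokens` and the vocabulary are hypothetical.

```python
import random

def sample_tokens(seed, vocabulary, length):
    """Draw a pseudo-random sequence; a toy stand-in for stochastic generation."""
    # A private generator, so the seed alone determines the output.
    rng = random.Random(seed)
    return [rng.choice(vocabulary) for _ in range(length)]

vocabulary = ["alpha", "beta", "gamma", "delta"]
first_run = sample_tokens(seed=42, vocabulary=vocabulary, length=8)
second_run = sample_tokens(seed=42, vocabulary=vocabulary, length=8)
print(first_run == second_run)  # same seed, same sequence
```

With a fixed seed and a fixed version of the code, the "random" behavior can be replayed exactly, which is what the two requirements in the tweet are asking for.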

2023-04-03 16:09:35 MATHEMATICS: Bayesian Statistics vs. Frequentist Statistics

2023-04-03 16:09:34 PHILOSOPHY: "Probability is a measure of belief" vs. "Probability is a measure of how frequently events occur"

2023-04-03 16:09:33 There are three separate but equal aspects of statistics: 1. Philosophy 2. Mathematics 3. Science Here's an illustration of how they play out:

2023-04-03 09:32:07 RT @kareem_carr: One of the most profound books, I've ever read is called "How to Read a Book". It's where I learned that there were levels…

2023-04-03 02:51:42 RT @kareem_carr: One of the most profound books, I've ever read is called "How to Read a Book". It's where I learned that there were levels…

2023-04-02 18:36:37 RT @kareem_carr: One of the most profound books, I've ever read is called "How to Read a Book". It's where I learned that there were levels…

2023-04-02 14:17:12 @annttiigs Yes!

2023-04-02 10:36:15 @CarlosVallejoBC Great question. I think so. Part of the reason I went through the exercise of writing this thread is to clarify the process in my mind so I could think about how one would automate some or perhaps all of it.

2023-04-02 08:42:35 RT @kareem_carr: One of the most profound books, I've ever read is called "How to Read a Book". It's where I learned that there were levels…

2023-04-02 01:36:11 @clkbsfth The idea to write this thread actually came to me because I was thinking about how to use ChatGPT for reading.

2023-04-02 01:32:59 @reef_building Yes. Exactly so!

2023-04-02 01:31:50 @aphofer Yes. The quiz show stuff is such an unexpected side story. What a cool anecdote that he was your neighbor.

2023-04-01 22:26:50 *the first generation of programmers after gpt-4 learns how to code* https://t.co/FwRrYBWvgU

2023-04-01 22:04:05 @BrianCHolt I’m planning to read more about Zettelkasten. It seems related although maybe more tilted toward exploration and discovery.

2023-04-01 22:01:02 @voigt_peter The trick is just to read it more than once.

2023-04-01 17:37:59 @UnclePaulyMath You’re welcome. Good luck!

2023-04-01 17:37:12 @David_NeverDave You’re welcome. Cool that you independently rediscovered this approach. It’s a very powerful way to read!

2023-04-01 17:34:17 @intrepideshiet1 I think it is probably. Legally available, I don’t know.

2023-04-01 17:32:22 @CravenRave Did you get to the part in the book where they talk about skimming books? This book is actually extremely skimmable by design.

2023-04-01 17:28:02 @ae_fernandes @eric_is_weird Using footnotes to identify potential reading material sounds like a really good idea to me! Only caveat is they were compiled for the book author’s purposes (their syntopic project) not yours. So you might have to do some filtering.

2023-04-01 17:23:14 @kenferrell It’s all about getting the ideas I guess. Doesn’t matter how you get them.

2023-04-01 17:16:08 @LandonSchnabel I have read it multiple times too. It gets better each time. So insightful! (I also got made fun of. )

2023-04-01 16:32:51 Thanks for reading! If you like this kind of content, follow me so you don’t miss out on upcoming threads. You can also support me by liking and retweeting the thread.

2023-04-01 16:32:50 I can't describe it but when you get to this stage with your topic, you will feel a deep sense of accomplishment like arriving at the end of a long hike. Your knowledge on the topic will feel authoritative, grounded and well-earned. You will feel like a scholar.

2023-04-01 16:32:49 6. ANALYZE THE DISCUSSION. This is where you answer the question you had at the start, the one that launched you on this journey. Having constructed a shared language and identified the big questions, describe what everybody is saying to each other and most importantly WHY.

2023-04-01 16:32:48 5. COMPARE ANSWERS. Identify how your authors answer your central questions and in particular, find out how they differ. The places where they differ most will tend to be the most important questions.

2023-03-31 23:21:52 @leils @ElieNYC @MarkHamill @AOC @JBMatthews @BobKarpDR @SAGES_Updates @marklewismd howdy! https://t.co/a6uTGqcgiR

2023-03-31 19:51:17 I thought the author_is_elon variable in the twitter recommendation algorithm code was a joke. but it's not... https://t.co/teJG0NO1C1

2023-03-31 19:03:37 never thought it would happen but @elonmusk just released the twitter recommendation algorithm. https://t.co/B7cNAfm4Er https://t.co/P24hTvGLAS

2023-03-31 18:45:46 @Butter___Man Maybe I should have said "no single real world interpretation". Not sure.

2023-03-31 18:26:49 @Swandraga https://t.co/NOfamfknq0

2023-03-31 18:25:28 they scrape up the stuff at the bottom of beer barrels, mix it with salt and eat it. truly unhinged stuff. https://t.co/XYPA48AYDk

2023-03-31 18:10:53 if you've ever seen what british people eat, this should be extremely disturbing to you. https://t.co/NVRCLvYWna

2023-03-31 18:04:43 @ZetaOf1 I'm not a mathematician. I'm a statistician (biostatistician). I'm in the business of interpreting data as evidence in support of or against various scientific propositions. I think Fisher saw his role in much the same way so it seems like it's been like that since the beginning.

2023-03-31 16:23:42 For a long time, I thought my preferred metaphors were superior. More sane. More grounded. More fundamental. I see now that these variations in mathematical metaphor are all pathways to building models that tap into the genius at the heart of our visual and physical intuitions.

2023-03-31 16:23:41 Frequentists use mathematical metaphors for frequency. Bayesians use mathematical metaphors for belief.

2023-03-31 16:23:40 I've been thinking about different mathematical constructs as metaphor. You can often tell what kind of mathematical scientist a person is by the flavor of their mathematical metaphors.

2023-03-31 15:30:51 @ZetaOf1 I'm not sure the issue only lies in operationalization because we typically have to interpret statistical findings as scientific ones which requires us to translate probabilistic statements into some kind of statement about the world.

2023-03-31 15:11:00 Did you know that statisticians don't have a consensus on what "probability" is? Leading theories are: 1. the frequency with which things happen 2. a measure of belief 3. a purely mathematical construct with no real world interpretation

2023-03-31 14:51:39 academics: communications isn't real work also academics: why won't more people engage with my research? academics: marketing isn't real work also academics: why won't more people use the solutions i provide? academics: management isn't real work also academics: why wo—

2023-03-31 14:09:00 It's common for people (idiots on social media) to assert that engaging with "2+2=5" type arguments implies one would be bad at engineering, but if you go into engineering thinking equations apply exactly without fudge factors, your bridges will most definitely collapse.

2023-03-31 13:49:14 I’ve been having a tough time lately. A severe loss of motivation. Anxiety. I'm incapable of enjoying simple pleasures like typesetting proofs of asymptotic properties of statistical estimators in LaTeX. Even using git brings me no joy. Dear followers, what should I do?

2023-03-30 18:49:07 @chrislhayes @oneunderscore__ I think the new AI tech is on the scale of the internet but it was a long road from dial up to Zoom. I think AI will be disruptive but only a little at first and then little by little such that we barely notice it. Look at how normal being always connected feels to us now.

2023-03-30 18:07:35 I understand why so many are concerned about AI but in terms of concrete, observable effects of GPT-4 and its siblings, there have been almost no effects. It's 99.999% hype at this point. People are calling for a slow down. A slow down of what? Literally nothing has happened!

2023-03-30 15:13:24 “can’t stop. won’t stop”, he said. https://t.co/whkmJDSUCj

2023-03-30 15:12:26 not a joke. source: https://t.co/BNIiHEmsHp

2023-03-30 14:34:36 How is this in Time magazine??? I can’t believe this needs to be said but no we shouldn’t be murdering people over fitting mathematical models to data. https://t.co/rKqWmJQIKV

2023-03-30 14:08:00 GPT-5 uses p-values correctly GPT-6 is a Bayesian GPT-7 doesn't care. Just wants answers GPT-8 reinvents machine learning GPT-9 trains a *larger* language model so it can do data science via prompt engineering

2023-03-30 13:20:00 I have a dream that one day people from different statistical fields like mathematical statistics, econometrics, psychometrics and machine learning will have a conference every 10 years or so to standardize terminology and coordinate notation. It would be a golden age. https://t.co/kqGsIkGMyJ

2023-03-29 15:55:29 Starting April 15th, only Stats Blue subscribers can have significant p-values.

2023-03-28 22:27:50 These "AI will probably kill us all but I'm working super hard on it anyway" takes don't bother me because I think they're right. They bother me because it suggests a large number of people working on this extremely powerful tech are a little bit psychopathic.

2023-03-28 22:22:47 @PsycheNerd @svpino @Grady_Booch I think code (like any kind of theory) is hard to validate whenever it interacts with the physical world. For instance, you can't validate that a change in code was able to increase user engagement or improve the performance of a self-driving car without a real world experiment.

2023-03-28 22:10:35 RT @kareem_carr: @Grady_Booch A statistics perspective might be helpful here. LLM output probably does have some relationship to reality (…

2023-03-28 22:10:30 @Grady_Booch A statistics perspective might be helpful here. LLM output probably does have some relationship to reality (since they are derived from a distribution of sentences that are related to reality) but that relationship is probably stochastic. https://t.co/2675VMgJhO

2023-03-28 22:02:38 @svpino @Grady_Booch Yes. I would for sure agree and I think it's even possible to automate validation, but for now, most ChatGPT output isn't validated.

2023-03-28 22:00:36 @PsycheNerd @svpino @Grady_Booch Sure. I would find that much more convincing as an engine for creating truths.

2023-03-28 20:57:50 @svpino @Grady_Booch I see your point but I think the new code is the equivalent of a hypothesis in science that needs to be validated through experiment. The programmer still needs to verify the code runs and that the outputs align with the intended purpose for it to be a “new truth”.

2023-03-28 17:13:49 If you read a thread about ChatGPT and it doesn’t talk about hallucinations or how to deal with them then I’m sorry but that person is selling you a pile of bullshit.

2023-03-28 15:59:52 @ClausWilke Unfortunately, I think the real motives behind all this are closer to what Sarah outlines here. https://t.co/J642jvLWQO

2023-03-28 15:55:08 @LucyStats What's that font that you're using for the code in upper left image. It's kinda .

2023-03-28 03:31:54 @g18_ian @ConceptualJames Thanks! I appreciate the support.

2023-03-28 03:31:04 @nick_ash_ @ConceptualJames Thanks for the support! Glad you enjoy the content.

2023-03-28 01:21:09 @CSatisficer it’s good for getting a sense of who is interacting with your tweets.

2023-03-28 01:16:32 twitter polls were always pretty bad (biased toward your followers) but now they’re going to be pretty much useless.

2023-03-28 01:13:52 me, data scientist: * silently screaming * https://t.co/UK077Slt8L

2023-03-28 00:03:39 I know this is most likely a diss of journalists but it makes a lot of sense to me. Like ChatGPT, journalists are probably absorbing information at a purely linguistic level because the underlying expert-level mental models aren’t directly accessible to them. https://t.co/JDUCeVOIEU

2023-03-27 23:07:03 @AbhiGhos I agree with you that it will be extremely disruptive especially in jobs where it’s good enough if you get things 80-90% right.

2023-03-27 22:52:00 @Rabid66 @ConceptualJames It’s a win for me if you (or others in the thread) learn something. If you learn nothing then I guess I lose but you lose too. ¯\_(ツ)_/¯

2023-03-27 22:35:20 @Rabid66 @ConceptualJames sounds like you don’t know much about me or what statisticians do for a living, which is fine, but let me introduce you to a taste of mathematical statistics: https://t.co/7Pr5SniH4A

2023-03-27 22:19:30 @ConceptualJames https://t.co/lRijvcm50p

2023-03-27 22:19:07 @ConceptualJames Here’s a nice thread about the combinatorics of counting clusters : https://t.co/8XYHbyZ8kJ

2023-03-27 22:15:34 @ConceptualJames Hey folks. I’m not a fan of this culture stuff. I post about math and statistics 99.9% of the time. Follow me if you’re into that. Here’s a sample of my typical math threads: https://t.co/LmYhpIKyZ8

2023-03-27 21:59:10 @picadiliensis @ChrisMurphyCT “It decided to teach itself”

2023-03-27 21:44:26 @aysyfin This is a hard question. If you give it lots of sentences, it’s designed to find patterns in them and give you more sentences like the ones you fed into it. Some of those new sentences might be interpretable by humans as new ideas but this is almost a side-effect of the process.

2023-03-27 21:19:31 I don’t correct everybody that’s wrong on the internet but @ChrisMurphyCT is a US senator. Seems kinda dangerous for him to incorrectly perceive ChatGPT as an autonomous, sentient agent acting independently of the for-profit company that created it.

2023-03-27 21:13:18 Data scientist here. This is impressively wrong. - It didn’t learn chemistry. It learned to *talk* about chemistry - It’s trained on humans writing about chemistry - It’s designed to learn language - What it learns is decided by OpenAI’s training data - OpenAI made it available https://t.co/KfyLfUErzh

2023-03-27 20:13:08 @sarahradz_ https://t.co/4Umibvzci8

2023-03-27 20:10:40 @cuevaperotti I spend a lot of time working with stochastic processes and function spaces. Not surprised that it comes out in how I talk about things.

2023-03-26 01:41:51 @thebirdmaniac why does he still care about this lol?

2023-03-24 18:15:41 @ChelseaParlett @daniela_witten Will be nice to have some vetted python alternatives. I don't have a very good sense of which libraries make good statistics assumptions.

2023-03-24 18:10:42 @ChelseaParlett now tell me why do I know your dog on sight? didn't even read the tweet first. just clicked in.

2023-03-24 17:38:48 @karlrohe @edgardocer yikes!

2023-03-24 15:56:02 I would just like to see them get the hang of humans first before attempting to get into the minds of robots that don’t even exist yet.

2023-03-24 15:53:11 I’ve noticed a heavy overlap between people who think future artificial general intelligences will totally for sure want to kill us and the kind of people who are generally very bad at understanding the motivations of humans who aren’t part of their specific group. Coincidence?

2023-03-24 15:22:57 @LogicalBadger It will be interesting to see what happens once the corporations start charging real money for access to the tools.

2023-03-24 15:21:40 @Mr_jrlawson Not sure how to interpret that. The government is often an early adopter of tech (usually the military). Often they are even part of the group inventing the tech!

2023-03-24 15:17:58 @edgardocer Any key insights that you'd like to share?

2023-03-24 15:12:12 Based on recent history, I think it might take a decade or so for ChatGPT to truly change how we work. Much like with computers, the internet and smart phones, there were early adopters but the general public took decades to incorporate them as the standard way of doing things.

2023-03-23 16:11:03 @skdh

2023-03-22 22:14:34 @ChelseaParlett Yikes! That student sounds completely clueless.

2023-03-22 20:58:37 @turnpikeops Yes. I’ve had that experience of fake functions as well. It’s very strange because the real versions of the function weren’t that hard to find on google.

2023-03-22 20:56:34 @ResearchChat I’ll have to try that. What kinds of things do you say to constrain it? Do you just say stuff like “Be careful to only use real sources”?

2023-03-22 16:28:57 @ConsumingChalk A lot of the time, I'm also just guessing and praying.

2023-03-22 16:28:14 Sounds useless *but* the title of the pretend reference did give me some good search terms and I eventually found what I needed lol. I find these AI helpful but often it's in a roundabout way.

2023-03-22 16:15:37 Asked Bard AI a question. Instantly gives me an answer with a reference without me even having to ask! Dear reader, the topic of the reference is *exactly* what I needed. At this point, I'm like "ChatGPT is trash!!!" I google the reference. Doesn't exist.

2023-03-21 15:40:48 The book is 112 pages! https://t.co/FsvuhNCDbz

2023-03-21 14:56:42 @BarbaraFantechi I also had questions about what they meant by mathematician. I suspect they mean applied mathematician roles in industry.

2023-03-21 14:36:27 @covidiots999 High risk yes.

2023-03-21 14:11:33 Data science roles which tightly integrate mathematics with critical thinking, scientific analysis and subject matter knowledge are likely to be much more resistant to disruption.

2023-03-21 14:11:32 Additionally, by multiple measures, the report rated Mathematician as one of the most vulnerable jobs with 100% exposure to disruption.

2023-03-21 14:11:31 The report doesn't directly discuss data science roles but we can draw some conclusions.

2023-03-21 14:11:30 Is Data Science at risk of disruption by ChatGPT? Based on a recent report by OpenAI, I conclude the risk is low. https://t.co/cGiGpPqkF1

2023-03-20 18:02:16 @ChelseaParlett The plan is to make a few billion dollars. That is the full extent of the plan.

2023-03-20 15:49:20 Am I the only one that remembers the timeline where more than 10 years ago IBM built a computer that wiped the floor with the two best players in the history of Jeopardy? https://t.co/7phC9kONIP

2023-03-20 14:02:43 @NoahHaber @OSFramework Congrats!

2023-03-20 13:26:45 @prisonrodeo @kokuraclouds Thanks. I try! Social media makes it so very easy to escalate.

2023-03-20 12:40:38 @AdamRutherford Not sure anybody knows but my best guess is the networks are hierarchical and the decision to draw a mouth happens at the top and then the lower parts are dispatched to create each tooth. Teeth are simple repetitions of a similar unit which suits the network structure but they… https://t.co/EiJvd1TMLE https://t.co/3fWj6FtuhN

2023-03-20 12:05:58 @kokuraclouds You might not know this but the entire job of a statistician is to nitpick technicalities in data analyses especially in situations where they can completely flip the final conclusion (as is the case here).

2023-03-20 12:01:30 RT @kareem_carr: Statistician here. You should not control for all relevant variables. For instance, sexism is a relevant variable but cont…

2023-03-20 12:01:18 @GGCanto @jonatanpallesen The main issue is his interpretation of what it means to successfully control for everything is wrong. It doesn’t mean there is no discrimination. It means you’ve captured most of the relevant variables. The next step is to investigate their causal structure regarding sexism.

2023-03-20 11:01:47 @jonatanpallesen The longer version of the argument is there are many proxy variables for sexism such that controlling for everything ends up meaning that you indirectly control for sexism.

2023-03-20 01:47:24 @yudapearl Statistics assumes there’s a human in the loop. It’s not fully formalized but it’s a background assumption of mathematical statistics that the analyst has a mental model of the world that’s guiding the analysis. I’m not sure we’ve ever built a machine that plays this role.

2023-03-20 01:26:40 Humans have a strong tendency to use our intelligence to seek social dominance because we are primates. AI isn’t going to behave like a primate unless we intentionally engineer it to be that way. https://t.co/PdHZ4R2j1k

2023-03-20 01:17:58 @yudapearl Do you mean an understanding of the mathematics of causal inference or knowledge of the causal structure of the universe or perhaps a combination of both?

2023-03-19 22:48:02 Statistician here. You should not control for all relevant variables. For instance, sexism is a relevant variable but controlling for sexism would be a terrible idea. https://t.co/Bldl43bHsl

2023-03-19 22:05:22 academic life before browser tabs and pdfs in “read later” folders https://t.co/umnuGPxLnl

2023-03-19 22:01:49 RT @kareem_carr: @bayesianboy How about a circle? I always felt like figuring out that pi was the relationship between the circumference an…

2023-03-19 22:01:45 @bayesianboy How about a circle? I always felt like figuring out that pi was the relationship between the circumference and the diameter of a circle was a big achievement for humans. https://t.co/KrJHlRitPk

2023-03-19 20:49:04 @yudapearl Can you say more about what kind of logic (that takes “data + assumptions” to “conclusions”) is required in order to make a data analysis procedure count as an AI technique?

2023-03-19 18:47:45 @yudapearl I’m not sure “AI” is the right framework here. Data analysis workflows can span multiple levels of automation and the degree of automation doesn’t seem to me to be the most relevant feature with respect to the overall usefulness of the analysis.

2023-03-19 14:39:34 @statsepi folks are not fans of this take lol. but on the scale of optimistic GPT takes, it barely even registers.

2023-03-19 14:14:55 @ahugheswriter Recently, I’ve been learning about object oriented programming in R. There are about 5 different independent systems for doing this. So it’s complicated. I ask GPT for code using each system. It’s often wrong but I can typically patch it myself or with the help of some additional… https://t.co/P8PG2JGNJ8

2023-03-19 13:45:38 I think we are experiencing a period where AI is going to do for programmers and lots of other fields what it has long done for chess players. It will massively flatten the learning curve for lots of people without access to high quality human tutors.

2023-03-19 13:45:37 In my experience so far, I’ve been picking up new ideas a lot faster with GPT’s help and the fact that it hallucinates has not been a problem because it keeps me on my toes and forces me to be an active learner. https://t.co/xGcz9bouoL

2023-03-19 13:43:41 @logorific2 From a statistics perspective, what’s missing from this analysis is the level of uncertainty. You should have a very low level of certainty that you are right about the deck being all jacks of clubs based on just one observation.

2023-03-18 17:26:39 What I notice is LLMs: 1. act like a database 2. extrapolate in places where a traditional database would report no info 3. transform inputs (like natural language) and outputs 4. apply patterns in a nested way 5. check for consistency between patterns (more noticeable in GPT-4) https://t.co/Q9nkSIeSaw

2023-03-18 17:12:13 @Lak5h Exactly! Thanks for this example.

2023-03-18 16:31:39 There's been a bottleneck on how much software humans could create because only a small percent of people can code. I'm excited to see how GPT-4 empowers a generation of non-coders to solve problems that were never going to be worth the time of engineers making $100k+/yr.

2023-03-18 16:07:56 In some sense, nothing changes for me with AI. As a data scientist, I already spend all my time giving instructions to computers, getting their help with computations and vetting the quality of the output.

2023-03-18 01:42:19 I say thought “patterns” because a model of thought isn’t thought. A computer can talk about love without actually loving.

2023-03-18 01:42:18 We will move on to bigger and better ways of modeling thinking but the impact of that insight will remain.

2023-03-18 01:42:17 The greatest scientific contribution of large language models has been demonstrating definitively that our thought patterns, including all our poetry and prose, are just fancy mathematics.

2023-03-17 08:20:04 @FPallopides @Aella_Girl Why would AI have any coherent drives at all? They aren’t biological. We’re the ones born with powerful needs that drive us. I think people are equating “intelligence” with an independent propensity to want to make changes in the state of the world. We have that but AI need not.

2023-03-17 07:59:35 @Aella_Girl We are primates so we have a (probably) inherent propensity for establishing hierarchies and competing for resources and social dominance. AI are a completely alien intelligence. The only reason they would do primate-like things is if we engineered our drives into them.

2023-03-16 22:30:19 The other thing I was thinking is we can probably create an AI that can take in the general layout of a dataset and produce code to do at least a first pass analysis of the data. Oh man. Am I thinking myself out of a job here!??

2023-03-16 22:30:18 Just realized that it's probably 100% possible to train an AI right now today to scan an image of a dataviz and produce the code needed to recreate that visualization. https://t.co/WodiBHq9nR

2023-03-16 20:35:10 @sisneruza Yup. I use it quite a bit. It's not perfect for sure.

2023-03-16 16:58:25 @tunguz Being resistant to pivoting can be a solid strategy as well. Arguably, this was the approach of the people who developed neural networks in the first place despite multiple periods of pessimism.

2023-03-16 16:51:44 My pet conspiracy theory is Google could easily make Google Search more effective but the current version is optimized to get us to click ads, and once ChatGPT/Bing AI gets established enough to steal substantial market share, Google Search will automagically get much better.

2023-03-16 16:44:45 Despite my skepticism about GPT, I use it daily. I even pay for ChatGPT. I might be a skeptic but if this stuff has relevance for data analysis work, I'm going to make damn sure I'm one of the first to know about it.

2023-03-16 15:27:48 @pbrane This tech is going to disrupt all kinds of knowledge work. I suspect your kid will be writing code or something like it with AI assistance in their job just like the rest of us. lol.

2023-03-16 15:22:20 I get that we all hope GPT-4 might be able to do that some day, and maybe we are already there, but perhaps somebody ought to actually try it first before declaring on twitter that GPT-4 can.

2023-03-16 15:16:41 This kind of stuff frustrates me. Thousands of likes and yet where is the evidence that GPT-4 can “write perfect tutorials”? This is pure fan fiction. https://t.co/VgtutJSZG1

2023-03-16 15:06:53 @Mateussf I don’t see how this is different from buying a book that’s full of conspiracy theories and reading it privately at home, or going to a lecture by a local kook, or hanging out on conspiracy forums. You can always find spaces where wild claims go unchecked.

2023-03-16 14:46:46 We're going to need huge innovations in content curation to match the huge AI-driven advances in content creation.

2023-03-16 14:28:48 @Mateussf And presumably you feel the individual reader can't be trusted to evaluate these claims for themselves and cross reference them with other sources if needed before accepting them?

2023-03-16 14:23:02 @SEthanMilne As with books and the printing press, there are concerns that regular people can't be trusted to figure out false information on their own.

2023-03-16 14:14:58 @Mateussf Historically it was a big concern that books would have wrong ideas in them (like Protestantism) and you couldn't trust regular people to figure that out for themselves.

2023-03-16 14:12:22 If we'd developed nukes via multiple private companies who were just trying to maximize profits, and spamming out poorly understood designs, we'd probably all be dead by now.

2023-03-16 14:12:21 - We fund their creation via organizations whose duty is to maximize profits - Our main design strategy is to construct them as poorly-understood black boxes - We design them around replacing rather than enhancing humans which is inherently adversarial These are all choices.

2023-03-15 21:47:44 @EpiEllie @LEGO_Group never fear. bilbo is writing a complaint letter for you. https://t.co/45wChNkPH6

2023-03-15 21:44:43 RT @EpiEllie: Hi, @LEGO_Group! Splurged on the Rivendell set because <

2023-03-15 13:54:50 @sarahradz_ @LongFormMath Looks legit.

2023-03-15 00:04:38 Looks like @OpenAI went with the movie credits version of authorship for their paper. Cool! https://t.co/1FdF0QpgeU

2023-03-14 23:58:17 Controversial Opinion: Major deep learning papers have been going in this direction for a while now and this is just the obvious culmination. If it costs tens of millions of dollars to replicate a result then it might as well not be replicable. https://t.co/AUtYZxQKdm

2023-03-14 22:34:01 is this a life changer? yes. but not a big one. i don't get all these hyperbolic claims of disrupting knowledge work. that's not been my experience. everything it outputs needs to be double-checked for accuracy and possible misalignment with my instructions.

2023-03-14 22:34:00 i'm not good at googling stuff. at least once a day, i used to go down a rabbit hole where i'd be searching for an hour and not be any closer to finding the info i wanted. chatgpt has been a lifesaver in this regard.

2023-03-14 20:19:10 @SolomonKurz If this was a school of public health or medicine class, I would consider it fine since this is the kind of data they'd be expected to work with in a professional context. For a general audience, a lot of people struggle with weight loss so some might find it triggering.

2023-03-14 18:24:05 My reaction to this is similar to if a human was boasting about their SAT/IQ scores. Just makes me wonder if OpenAI is posting about their AI's standardized test scores due to a lack of other significant real world achievements. https://t.co/tCwHGzeRKI

2023-03-14 16:23:00 When we do statistics, we're leveraging two of the most powerful engines of knowledge creation known to humankind: mathematics and the scientific method. So why are statistical analyses so often wrong? Because an analysis is only as strong as its weakest assumption.

2023-03-14 16:20:14 @SolomonKurz I just mean the variance is considered the more fundamental concept. It could have gone either way since standard deviation and variance are functions of each other but the variance operator is much easier to work with algebraically speaking.

2023-03-14 16:13:18 @SolomonKurz I think it’s not considered a “first class” operator in the way E and V are. It’s a function of the variance but more awkward to use. E[X+Y]=E[X]+E[Y] and V[X+Y]=V[X]+V[Y] for X,Y ind. but S[X+Y]≠ S[X]+S[Y]. So you almost always want to express things in terms of V not S.
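
A quick way to see the point about variances adding while standard deviations do not is to enumerate a small exact case. The sketch below is only an illustration: the choice of two independent fair six-sided dice and the helper name `variance` are assumptions made here, not part of the original reply.

```python
from fractions import Fraction
from itertools import product
from math import isclose, sqrt

faces = range(1, 7)  # one fair six-sided die (illustrative assumption)


def variance(values):
    """Exact population variance of a collection of equally likely values."""
    values = list(values)
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n


# Variance of a single die, and of the sum of two independent dice.
# All 36 ordered pairs are equally likely because the dice are independent.
var_x = variance(faces)
var_sum = variance(a + b for a, b in product(faces, faces))

# Standard deviations are square roots of the variances.
sd_x = sqrt(var_x)
sd_sum = sqrt(var_sum)

print("V[X] =", var_x, " V[X+Y] =", var_sum)
print("S[X] + S[Y] =", 2 * sd_x, " S[X+Y] =", sd_sum)
```

In this example the variance of the sum is exactly twice the single-die variance, while the standard deviation of the sum is sqrt(2) times the single-die value rather than twice it.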

2023-03-14 13:34:29 Happy Pi Day!!!… https://t.co/8nsRtyVl7Z

2023-03-13 22:07:50 @ChelseaParlett I don't know what the math folks would say but my mental model is a flexible, continuous surface like a wrinkled bedsheet. Most of the math follows from what you'd need to do to navigate on a surface like that (assuming you can't just fly).

2023-03-13 14:35:02 A lot of people coming up in data science these days seem to think the point of learning data science is so you can get a fancy data science job. I think the point is it empowers you to analyze data and come to solid conclusions which is useful no matter what your job is.

2023-03-12 10:11:28 @DevinGoure They are definitely elitist. I’m not saying I want this. I’m saying they have lots of options for preserving their business model.

2023-03-11 18:22:41 RT @kareem_carr: Back in the day reputation was everything. People would kill people just for calling them a liar. With AI generated nonsen…

2023-03-11 18:20:52 @lastpositivist He blocked me too despite having never interacted. It’s strange. I suspect it’s because I’m liberal-ish seeming.

2023-03-11 17:16:52 @mathematicsprof

2023-03-11 16:55:05 In the future, every statement will need a human being to vouch for it. We won't trust it without somebody that can be held accountable. Somebody to put on the wall.

2023-03-11 16:55:04 We've let ourselves be convinced that taking the reputation of the source of the information into account is some kind of bias or "ad hominem". This was a luxury. Soon all that will be impossible.

2023-03-11 16:55:03 But not too long ago, before the internet, that's exactly how it worked. You asked your friends, family and other people in your community for information and if they were wrong or didn't know then that was it. It wasn't perfect but it worked fine.

2023-03-11 16:55:02 Back in the day reputation was everything. People would kill people just for calling them a liar. With AI generated nonsense on the rise, I think the age of reputation is returning.

2023-03-11 16:24:06 @curiouskiwicat @RuxandraTeslo I think your comment assumes being fair to everybody is a higher priority than it is. If the alternative is total system failure, I don't think being exclusionary will keep the academic publishing industry up at night.

2023-03-11 15:55:08 @_ch_ase I know right. It *costs* money to publish papers.

2023-03-11 15:53:12 @RuxandraTeslo It would just be a filtering step to improve the signal to noise ratio before asking peer-reviewers to spend time vetting the work more thoroughly. So it’s not just about affiliation.

2023-03-11 15:31:10 People who publish in elite journals will happily go through the extra work to get past these filters and predatory journals don’t care about the quality of the paper since they get paid either way.

2023-03-11 15:31:09 This comment suggests low familiarity with academic publishing. Journals have lots of options for filtering who can submit papers like restricting to people with affiliations to academic institutions or requiring recommendation from a trusted list of recommenders. https://t.co/Sh4GOCpZXH

2023-03-11 10:27:14 @brian_is_tired I think I see where you’re going. I think you’re attempting to make sure a cluster always has at least one member. This approach isn’t precise enough about only adding an item to a cluster if you need it to make that cluster not empty.

2023-03-11 10:05:46 RT @kareem_carr: How many ways are there of dividing 100 items into 3 clusters? 85896253455335221205584888180155511368666317646 If *each*…

2023-03-10 16:34:44 RT @kareem_carr: How many ways are there of dividing 100 items into 3 clusters? 85896253455335221205584888180155511368666317646 If *each*…

2023-03-10 15:56:49 @dr_pete In practice, we can reduce the possibilities hugely if the items can be conceptualized as points in some space and we can assume that neighboring points are more likely to be in the same cluster.

2023-03-10 15:30:13 Thanks for reading! If you like this kind of content, follow me so you don’t miss out on upcoming threads. You can also support me by liking and retweeting the thread.

2023-03-10 15:30:12 In practice, the approximation kⁿ/k! is actually a pretty good estimate for realistic values of n items and k clusters.

2023-03-10 15:30:11 The explanation of why the formula for the Stirling numbers of the second kind is the correct formula is a story for another thread. The argument involves the inclusion-exclusion principle which I wrote about earlier this week: https://t.co/So9Ptt3o2k

2023-03-10 15:30:10 Writing code to compute these numbers from scratch can be a bit tricky because of the large numbers involved. If you'd like to play around with the Stirling numbers of the second kind in code, here's how to compute them in both R and Python. https://t.co/IgB1I8DuH1
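
The code linked in the post above is not reproduced in this log. As a stand-in, here is a minimal Python sketch that computes the Stirling numbers of the second kind with the standard inclusion-exclusion sum. The function name `stirling2` is chosen here for illustration and is not taken from the thread.

```python
from math import comb, factorial


def stirling2(n: int, k: int) -> int:
    """Ways to split n labelled items into k non-empty, unlabelled clusters."""
    # Inclusion-exclusion: start from all k**n labelled assignments and
    # alternately remove/restore those that leave j clusters empty.
    total = sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))
    # Dividing by k! discards the ordering (labels) of the clusters.
    return total // factorial(k)


exact = stirling2(100, 3)
approx = 3 ** 100 / factorial(3)  # the k**n / k! approximation, as a float

print(exact)
print(exact / approx)  # ratio is extremely close to 1
```

Python integers have arbitrary precision, so the exact count for 100 items and 3 clusters is computed without overflow; only the approximation is held as a floating-point number.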

2023-03-10 15:30:09 The exact formula, adjusted to exclude empty clusters, corresponds to a series of numbers called the Stirling numbers of the 2nd kind. They are denoted by the number of items n positioned on top of the number of clusters k and placed between two curly brackets: https://t.co/tn7EQs9gZu

2023-03-10 15:30:06 The formula we have so far can be generalized as kⁿ/k! for n items and k clusters but it's only an approximation.

2023-03-10 15:30:05 We can now see that our previous estimate 3¹⁰⁰ is over-counting by a factor of 6. Using this information to refine our guess, we get 3¹⁰⁰/3!≈ 85896253455335221839410188294270212117017920334 This is about 99.999999999999999% correct but not quite perfect.

2023-03-10 15:30:04 Now imagine you have just 3 different colored items to place in three clusters A, B and C where at least one item must be in each cluster. Note in the image below that these are identical clusterings. Only the content of the clusters matters not their ordering or labels. https://t.co/rR5Ye2FJ5S

2023-03-10 15:30:01 How many ways are there of dividing 100 items into 3 clusters? 85896253455335221205584888180155511368666317646 If *each* star in the universe exploded into as many pieces as there are stars in the universe, that's how big this number is. If that surprises you, read on:

2023-03-08 17:09:28 SOLUTION III: INTERNATIONAL TREATY According to Wikipedia, in the US and France, it's PEMDAS. For Canada, UK, Australia, Pakistan, India, Bangladesh, West Africa, it's BED/BOD/BIDMAS. Either way, for the good of humanity, we need to agree on a precise order of operations.

2023-03-08 17:09:27 SOLUTION II: MANDATE We could make it MANDATORY to always use parentheses as in 6÷(2(1+2)) or (6÷2)(1+2). If we can all agree to pay a small tax in the form of a little extra effort to read and write arithmetic expressions, we would never have ambiguity again.
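
Programming languages already live under a version of this mandate. The short Python sketch below shows the two fully specified readings of the expression from the thread. Python has no implicit multiplication, so the `*` has to be written out, and the variable names are illustrative only.

```python
# Python gives / and * equal precedence and evaluates them left to right,
# so this is parsed as (6 / 2) * (1 + 2).
left_to_right = 6 / 2 * (1 + 2)

# Explicit parentheses force the other reading: 6 / (2 * (1 + 2)).
grouped = 6 / (2 * (1 + 2))

print(left_to_right)  # 9.0
print(grouped)        # 1.0
```

The two readings give different answers, and the language's convention settles which one an unparenthesized expression means.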

2023-03-08 17:09:26 Sure that means we won't be able to write stuff like "6÷2" in purely text formats like a tweet ever again, but screw freedom of expression. Isn't it worth it to be safe?

2023-03-08 17:09:25 SOLUTION I: PROHIBITION We could all agree to NEVER use ÷ or the inline / as in "6/2(1+2)". Always write division like this: https://t.co/Tagylm3BYr

2023-03-08 17:09:24 The solution to this math problem is political. Read on to find out why. https://t.co/wFKHnnZ1Qf

2023-03-07 18:55:56 @chukpl @janettereinke

2023-03-07 18:31:00 @matthematician In my personal experience, teaching doesn't always imply learning.

2023-03-07 17:02:32 These types of posts are good actually. They teach people that mathematics is a deeply human activity that relies on shared conventions to function properly. https://t.co/3WVTQniJcw

2023-03-07 16:23:04 @DebraSJudge sounds groovy https://t.co/mM2oBbnelH

2023-03-07 15:38:06 hot take: individual expert opinion is a very small part of the overall science. https://t.co/NswK2dzLPC

2023-03-07 15:15:51 RT @LaughatWally: A fun follow for the math curious.

2023-03-07 12:40:35 @AndrewPGrieve Thanks for sharing.

2023-03-07 12:39:53 @YairAizenman Thanks. There’s some stuff about the confidence interval on the wiki page: https://t.co/TlxvhwmQQD

2023-03-07 12:37:14 @alscor1966 This isn’t the best estimator. It’s just an easy one to explain given the constraints of Twitter.

2023-03-07 12:35:53 @PiscisBailey No but sounds interesting! Thanks for sharing.

2023-03-07 12:35:19 @higgs_neil Thanks!

2023-03-07 12:34:51 RT @kareem_carr: Mark and Recapture is a powerful statistics trick for counting a large number of things without actually counting most the…

2023-03-07 09:47:01 @aryehazan It was an assumption that doesn’t hold true in general and may not even have held true in the context I was using it. But I had run the algorithm 100s of times so I had strong intuitions &

2023-03-07 01:32:52 RT @mikejschmidty: A very good explanation of how to use mark recapture to estimate the total number of individuals in a population.

2023-03-07 01:32:40 @aylovedata Not off the top of my head but the wikipedia article is not too bad: https://t.co/TlxvhwmQQD

2023-03-07 00:52:08 RT @kevinriggle: Kareem is constantly dropping stuff like this, extremely worth the follow. And if you like this I learned about it from HO…

2023-03-07 00:27:14 @momenyl

2023-03-06 21:19:17 @bgreenwell8 @wrightstate Awesome!

2023-03-06 21:17:53 @jakobpunkt You’re welcome! It can get more complicated as we try to improve on this estimator but the most basic version is pretty simple.

2023-03-06 21:13:54 @KarenCampe Yeah. Exactly. I would maybe write 6/53 ≈ 50/N as in they are approximately equal.
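
A minimal sketch of the arithmetic behind 6/53 ≈ 50/N. It assumes, since the reply does not spell this out, that 50 animals were marked in the first sample, 53 were caught in the second sample, and 6 of those carried marks. This basic form is commonly called the Lincoln-Petersen estimator. The function name is hypothetical.

```python
def mark_recapture_estimate(marked: int, second_sample: int, recaptured: int) -> float:
    """Basic mark-and-recapture estimate of the population size N.

    Solves recaptured / second_sample ≈ marked / N for N.
    """
    if recaptured == 0:
        raise ValueError("need at least one marked animal in the second sample")
    return marked * second_sample / recaptured

# Assumed reading of the numbers in the reply above.
n_hat = mark_recapture_estimate(marked=50, second_sample=53, recaptured=6)
print(round(n_hat))  # 442
```

The estimate is only as good as the assumption that marked and unmarked animals are equally likely to be captured.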

2023-03-06 17:20:32 @Darrenmacey Thanks for the support. I appreciate it!

2023-03-06 16:38:31 @trcull Yeah. That's right. This is the same as the assumption that a marked and an unmarked bird have the same probability of being captured. If the marked are slow and the slow are more likely to be captured then that would violate the assumption.

2023-03-06 03:11:30 RT @kareem_carr: The Inclusion-Exclusion Principle is a really powerful math concept. It starts out with a grade school level observation…

2023-03-06 02:44:07 @kate_eviva Thanks! I appreciate the support!

2023-03-05 21:10:48 @ChelseaParlett Yes. I think so. I would say burnout happens when you ignore your basic biological and psychological needs for too long which is more likely to happen when you are obsessed with some activity.

2023-03-05 19:30:21 RT @YanndeMey: Great thread by Kareem! I always loved Venn diagrams in school

2023-03-05 19:30:14 RT @kareem_carr: The Inclusion-Exclusion Principle is a really powerful math concept. It starts out with a grade school level observation…

2023-03-05 19:06:38 @JanLaalaa This sounds like a good idea. I’ll put it on my list of topics!

2023-03-05 11:40:46 @vboykis https://t.co/4Z5TgfZ5qL - work with latex in your browser, share with others, huge library of paper templates, version control, easy compilation https://t.co/eUO5LA5iNm - previews of short code, uses AI to translate pics of typeset equations and even handwriting into LaTeX

2023-02-27 16:03:17 @PhDemetri well deserved!!!

2023-02-25 04:56:03 RT @kareem_carr: I'm starting a newsletter! Imagine if the New York Times wrote in-depth articles on stats and data science with the same…

2023-02-24 23:54:47 @dearisra Thanks for the support.

2023-02-24 18:02:00 @MathFrustration I’ll be writing them.

2023-02-24 17:39:00 @Brandonkw__ @TimHarford I assume they are reporting on the world using data analysis. So a bit different. I am planning to do reporting on data analysis itself. How various aspects of it works and why it’s important.

2023-02-24 17:22:26 RT @jtsveigdalen: I love this idea.

2023-02-24 17:22:21 @jtsveigdalen Thanks!

2023-02-24 17:21:33 @Kari_S_Listener Thanks!

2023-02-24 16:48:16 @Kari_S_Listener Heh. I wasn’t intending to get sucked into the culture wars here. Just trying to explain the level of detail.

2023-02-24 15:11:29 • Official launch will be in Late 2023/Early 2024 • All content will be free (but irregular) until launch • Paid tier: ~$5-10 for 2-4 long-form articles per month • Free tier: 1-2 short articles a month Details subject to change.

2023-02-24 15:11:28 I'm starting a newsletter! Imagine if the New York Times wrote in-depth articles on stats and data science with the same level of detail that it covers overseas wars or climate change. Enough info to get what's going on in the data world without being overwhelming.

2023-02-24 15:04:09 @minilek @CourseKata @getbootstrap Glad you liked it! It's been bothering me that people think statistics and calculus are incompatible. They are absolutely not. We need to build a movement around teaching data science the right way that empowers students. Especially students from marginalized backgrounds.

2023-02-24 14:48:27 @EmUprichard Sorry about that. I've never felt so crushed that Twitter won't let you edit tweets.

2023-02-24 13:09:34 @poopmachine I agree that examples in math class could in theory be practical and concrete but typically they aren't. Data science is definitely not optimized for teaching math, but what's good about data science is being practical and concrete with numbers is the default context.

2023-02-24 12:56:22 @minilek Hey @minilek. I wrote this partially as a defense of the other side of the argument. Would love to hear your thoughts. https://t.co/20K1i6OO16

2023-02-24 12:53:03 @poopmachine Fair question. The idea is it's a more concrete example that has practical relevance which I think is easy for (some) people to digest.

2023-02-24 12:46:15 @f2harrell @stephensenn @vandy_biostat I don’t think many of the assumptions we make about RCTs are any more verifiable but they do seem to hold in practice. One reason clinical trials are so complicated is we discovered over time that we needed legal enforcement mechanisms to compel RCT data to meet our assumptions.

2023-02-24 12:25:00 @love2laugh4ever I messed up the thread. Here's a correction: https://t.co/DBR5gg69u2

2023-02-24 12:24:26 @747Retired I messed up the thread. Here's a correction: https://t.co/DBR5gg69u2

2023-02-24 12:23:30 @tec_man0 Thanks for this! Sorry it took me a day to figure out what you were saying: https://t.co/DBR5gg69u2

2023-02-20 17:16:19 Not to toot my own horn but called it in 2018! https://t.co/zjK552y6NU https://t.co/NO7pKWeNkh

2023-02-20 01:21:18 @spylinen the goal statement is aspirational like any other science ie “biology is the science of life” or “physics is the science of matter and energy”.

2023-02-20 00:50:16 I feel like statistics is the science of knowing how wrong you are. Being less wrong is optional. https://t.co/EqVFdBGZbP

2023-02-20 00:41:34 @Gopal_Kot not wrong.

2023-02-19 23:30:39 @LaddRyan54 let’s ask chatgpt https://t.co/nRRYgEpi8J

2023-02-19 23:19:06 Is logistic regression a machine learning algorithm? yes yes yesyes yesyes yes yes yes yes yes yes yes yes yes yesyes yes yes yes yesye yes yes yes yes yesyes

2023-02-19 20:36:07 @computoloco How would you characterize them?

2023-02-19 20:31:38 Not gonna lie. Mathematica is lowkey goated in situations where symbolic computations are the vibe.

2023-02-19 20:17:39 What are your favorite Machine Learning topics? For me, it’s: - Evolutionary algorithms - Deep learning is a close second - K-means clustering - Random Forest - Bias-Variance Tradeoff

2023-02-19 17:35:32 for the people who are calling this wokism, should we make it illegal for companies to maximize profits or consumers to spend money according to their preferences? because that’s what’s going on here. https://t.co/nhq7Y4Nwa0

2023-02-19 16:35:32 The fastest way to learn math or programming is to have genuine curiosity, but genuine curiosity is the first thing to go out the window when the stakes are high. To unlock your inner genius, find ways to lower the stakes.

2023-02-19 15:46:34 @SarahGrynpas Aww.

2023-02-19 15:37:07 @Anatonomicon Great question. I could see myself writing about "Least Squares" or "Variance Explained" in general: why they're used, what's good or bad about them, etc. Anything more granular would need a different business model, maybe like a premium tier.

2023-02-19 15:24:19 Mastering all three is a powerful combination that will make you stand out as a data analyst.

2023-02-19 15:24:18 Mastering these 3 kinds of math will take your data science skills to the next level: - data numeracy (knowing which algorithms to use, what the common data issues are, what kinds of assumptions are reasonable) - probability - mainstream math (linear algebra, calculus, etc)

2023-02-19 03:00:30 RT @kareem_carr: I want to do a substack where I write short, focused explainers on topics in stats and data science for busy people. “Eve…

2023-02-18 15:09:19 @WSIB_Paralegal Not a Dr yet but I should be by the time I officially start the project.

2023-02-18 15:05:52 RT @kareem_carr: I want to do a substack where I write short, focused explainers on topics in stats and data science for busy people. “Eve…

2023-02-18 15:05:43 @ChrisSt60478932 @randomlysampled Thanks. These are great topics. Each one of them could be an article. These are definitely the kinds of topics I'll be interested in tackling.

2023-02-18 14:31:42 @edgardo_block I would definitely be open to writing a series of case studies for general audiences!

2023-02-18 13:53:57 @edgardo_block Thanks! Anything I can change to make it more tempting for you?

2023-02-18 13:51:25 @svarasura That’s a cool topic!

2023-02-18 13:50:25 @ajay_kolii Thanks for the support!

2023-02-18 13:50:07 RT @ajay_kolii: Excited to read and know more about it. All the best for your project I'm sure based upon your past tweets that your art…

2023-02-18 13:49:07 @ChrisSt60478932 @randomlysampled What kinds of questions do you have about them?

2023-02-18 13:44:22 RT @randomlysampled: I signed up

2023-02-18 13:44:10 @randomlysampled Thanks for the support!

2023-02-18 13:00:01 @LadySynaptic @Nick_Lange_ That’s awesome. Congrats!

2023-02-18 12:54:17 @sueantownsend Thanks for sharing. I really appreciate the feedback and I do want to hear from people that maybe don’t think it’s worth it. At what price per month would this service be tempting for you?

2023-02-18 12:44:54 @_StephenOlivier Yes!!! Hope you signed up.

2023-02-18 07:02:34 @sueantownsend I am not unfamiliar with prestigious academic institutions. In fact, I was part of the teaching staff for "Introduction to Data Science" at the Harvard School of Public Health just last semester. I fully respect your financial choices. Just wanted to clarify.

2023-02-18 04:18:51 @JohnHenry_US Thanks for the support.

2023-02-18 04:04:16 @MadladMunson Yeah I would. I think having a math tier of some kind might be the best way to handle it. That way people can choose how much math they want to see.

2023-02-18 00:24:19 @LKaboolian Thanks, Linda!

2023-02-18 00:23:20 @SubstackLinda Oh wow. Thank you so much! I will.

2023-02-17 23:31:59 @virginicus A deep dive on model convergence sounds like a lot of fun! I would definitely write about that.

2023-02-17 01:23:59 @dan_p_simpson Sounds like a banger to me. Krylov subspaces is one of my favorite concepts in linear algebra!

2023-02-16 18:59:40 @KeithRowley It’s pretty common for people to be good at math in general but find that probability theory doesn’t come easily. It’s a different skillset I think. I think this is a good book for people in your situation (the author also has videos on youtube): https://t.co/x34sJbRslG

2023-02-16 18:19:35 @tjmahr @grrrck thanks! switching to this immediately!

2023-02-16 18:12:46 @tjmahr what’s this color scheme???

2023-02-16 16:20:00 When studying difficult concepts, watch out for negative self-talk. It's easy to miss that you're doing it, and it can completely destroy your love for a subject.

2023-02-16 15:42:00 R is designed from the bottom up for statistics which makes it tricky to learn for those with low familiarity with data analysis. If this is you, Python might be the better option.

2023-02-16 15:28:56 @kageni_b I can't really speak to your individual experience but sometimes it's not the specific teacher but the teaching environment like high-stakes tests or overly competitive classmates.

2023-02-16 15:15:32 @NancyRGough I've found it's useful for generating hypothetical answers that I can later verify the correctness of, either by testing them directly or (now that I know what to look for) googling to get more authoritative sources.

2023-02-16 15:12:00 Math anxiety is not about math. It's a trauma reaction to bad teaching.

2023-02-16 14:56:29 @ben11kehoe Yeah. Definitely.

2023-02-16 14:30:52 One of the biggest barriers to getting programming help is not even knowing what terms to search for. Pro Tip: Use ChatGPT first to figure out the right search terms

2023-02-16 00:16:59 @brunostefoni @nobleman_phd I read it through a few times. I think I’ll have to think longer on it. I only read it for the first time yesterday. I got hung up on why the curve starts rising again here. Also why does the best p >

2023-02-15 22:20:50 i find my reaction to chatgpt making stuff up is very different from others. i usually feel good about it. like i'm making progress. i think to myself "great we've gotten to the limits of your knowledge on this aspect, let's change directions"

2023-02-15 22:08:43 @prabinov42 I read it and thought it was really good. I just have questions. Like why does the model eventually get worse in her example? https://t.co/qafknED6J4

2023-02-15 21:49:09 This is what OpenAI had to say about it in December 2019 but that was like fifty years ago in machine learning years. https://t.co/0uhKBRAef1

2023-02-15 21:49:08 I've been reading articles about Double Descent all day. Am I correct in concluding that this phenomenon is not yet fully understood? https://t.co/Ha3Y9Vg3xX

2023-02-15 20:48:44 @tunguz I could pass the test and I am also not a citizen. ¯\_(ツ)_/¯

2023-02-15 20:45:25 @nemirovs It's a diagram I made to illustrate some thoughts i've been having about ML vs statistics

2023-02-15 17:41:17 @mbeisen Gatekeeping? Not necessarily in a bad way. In my experience, people get pretty upset when bad papers are published and peer-review is a way to directly contribute to keeping the standards high.

2023-02-15 17:28:28 feelin fancy https://t.co/wtadqv38dw

2023-02-15 01:37:34 @taz_chu same vibe lol https://t.co/287P5JCUGF

2023-02-15 01:02:18 A lot of smart people are becoming overly pessimistic about chatgpt and similar technologies. They’re focused on what the naive user will do with it but that’s the lowest bar. I think this tech will be extremely powerful in the hands of skilled users with refined workflows.

2023-02-15 00:34:41 there’s no explaining love https://t.co/5BL3Cf9VRe

2023-02-14 23:58:54 @potatoffel Yeah. I see. Thanks for sharing!

2023-02-14 23:09:47 @potatoffel It improves because of regularization which keeps it in that ideal zone, no? Otherwise, model performance would shift toward that area on the right.

2023-02-14 23:06:49 @gehsbarg Fair enough. I think "understandable" here means something like: how easily can you relate the precise numerical values of and relationships between the parameters to the behavior of the model.

2023-02-14 17:00:00 roses are red violets are blue π to three decimal places is 3.142

2023-02-14 16:49:05 @paul_elotro Good book! Worth reading carefully.

2023-02-14 16:26:41 @GraziosiSergio source is me. yes.

2023-02-14 16:02:51 been thinking about this for a while https://t.co/I5KwtvqVuL

2023-02-14 14:33:55 It's amazing how much progress has been made in the design of computational algorithms while still having no clue if P=NP. Just goes to show fixating on the sexiest math puzzles isn't always the best way forward.

2023-02-12 11:03:18 @AndrewLampinen Great thread. Thanks for sharing the relevant literature. Lots to consider.

2023-02-10 16:59:54 When conclusions derived from data fail us, that's the data science version of an engineer building a bridge that later collapses. We all want bridges that don't collapse.

2023-02-10 16:59:53 Everybody's a data scientist. Data science applies any time you have some numbers and want to make conclusions that you can rely on in a new situation. This seems to me to be a near universal experience.

2023-02-10 15:39:38 People don't fully realize they need validation yet, but I think once they get burned a few times, they will be clamoring for it.

2023-02-10 15:39:37 I'm feeling optimistic about data science. I predict there will be a lot of data science work around validating the outputs of large language models.

2023-02-10 05:56:43 @vidbina @BryanTegomoh @sama @MathJax Yeah. I think it works because it sets ChatGPT's context to math papers which are more likely to be correct mathematically.

2023-02-09 23:40:17 @PhDemetri It’s the kind of thing I don’t need now that I mostly do analyses for myself but needed when my analyses were primarily for collaborators and they’d ask me to rerun things with lots of slight variations on short notice.

2023-02-09 21:21:43 @RobEbymathdude @sama Yeah. That’s the plan right now. Cutting and pasting to mathpix as needed. (but usually I can read the latex directly.)

2023-02-09 20:55:32 @economeager @UNSWEcon Congrats!!!

2023-02-09 20:55:18 chatgpt is gonna be pretty fire once they configure the GUI to render latex @sama

2023-02-09 20:53:32 @scottmoore The trick is to ask it for things that require translation of information that it probably already has seen in its training data vs original reasoning. (Explanations seem to be a type of translation task for it.)

2023-02-09 20:06:43 @Dark88244288 Success as measured by amount of capital allocated.

2023-02-09 19:02:05 Does the success of massive machine learning models like ChatGPT and Bard mean a lot of machine learning jobs are going away? It's starting to feel like either you need to be working for a multibillion dollar corporation (which only a few can do) or you're just going to be… https://t.co/aBFKG2sVzd

2023-02-09 18:44:52 @PWGTennant figured there was no harm in trying it out. it's only $8. you can't even buy a decent sandwich for that in boston.

2023-02-09 18:19:21 i asked chatgpt about a somewhat niche math thing (involving complex-valued random variables) and it gave a reasonable answer (i think). i then followed up by re-reading the relevant section of the wikipedia article which was a lot easier to follow after chatgpt’s explanation!

2023-02-09 18:12:08 RT @thejoshuap: The only correct use of the long tweet format

2023-02-09 18:04:39 RT @sqiouyilu: this is the only correct use of the longer tweets format

2023-02-09 18:04:35 RT @caffeineguru: This is the only appropriate use of longer tweets.

2023-02-09 17:26:29 you know what longer tweets mean? more pi for everybody!… https://t.co/UchpGoheQL

2023-02-09 01:04:28 @RehydratedTater Well said!

2023-02-09 00:29:27 @shawntsullivan I think they do have a choice though. AIs can’t vote but humans can.

2023-02-09 00:01:14 RT @kareem_carr: @shawntsullivan People don’t want to sit at home collecting UBI while, without acknowledgement or consent, some soulless A…

2023-02-09 00:01:03 @shawntsullivan People don’t want to sit at home collecting UBI while, without acknowledgement or consent, some soulless AI reproduces their drawing style that they developed over a lifetime of study and practice.

2023-02-08 23:53:21 @jesterhoax Right to your own data under the rubric of privacy rights similar to what’s happening in Europe. Making companies seek consent proactively and not allow them to force it on you as a price of using their service. There’s already a norm in the sciences for fully informed consent.

2023-02-08 23:43:27 I think the current conversation around how AI will displace current workers is naive in the sense that it’s ignoring the potential for this huge population of voters to shift the law in their favor.

2023-02-08 23:35:40 All my DMs on the web app are just…gone. Like tears in…the…rain. Time to die.

2023-02-08 23:28:29 Is Twitter dying? I can't even whine about it with my besties in my DMs.

2023-02-08 23:25:46 @PhDemetri The key to selling people on Bayes, in my opinion, is selling them on the plausibility of Bayesian ontology. It's a heavy lift to be sure with those that don't already believe.

2023-02-08 23:19:39 "Truth" is the most predictive model that is also causal and has a latent space that is comprehensible to humans.

2023-02-08 18:04:14 I guess my feeling is “you get what you pay for” ¯\_(ツ)_/¯ one person isn’t magically going to be able to do the work of 10 people in 20% of the time.

2023-02-08 18:01:01 Interesting thread on the “rise of the business scientist”: “Gone are the days when companies were willing to waste $1,000,000+ per year on expensive data science teams” https://t.co/nXDxalfXxP

2023-02-08 17:30:19 @HenningStrandin We might both perceive a glowing object in the sky and you might perceive it as kind of lightning and I might perceive it as a ghost. We would likely diverge on what phenomena we would lump into the "ghost" concept vs the "electricity" concept. So they aren't interchangeable.

2023-02-08 17:06:38 @HenningStrandin If the concept is so obvious, why did it take most of human history to invent it?

2023-02-08 16:39:32 @HenningStrandin It just seems like you are assuming that which is perceived is actually energy which seems kind of circular. My original post was basically the idea that physicists can see a pattern in data and be like “from now on when something like this happens, I’ll say ‘energy’ did it.”

2023-02-08 16:23:03 @HenningStrandin Not sure I understand. Humans got by without the concept of energy (as expressed in physics) for tens of thousands of years. Most humans still don’t learn about energy as the mathematical construct that physicists know it to be.

2023-02-08 16:19:39 Should your data science professor be clean? https://t.co/FB0CCGSVDY

2023-02-08 10:15:00 @skdh Oh no! Hope you feel better soon.

2023-02-08 00:54:26 @tunguz You think it would be easier to manually curate an up-to-date contact list for ~100 people vs just posting it on facebook?

2023-02-08 00:48:26 @coolnameliz Thanks for sharing that and I appreciate the follow.

2023-02-08 00:36:43 @tunguz if you wanted to efficiently distribute baby pics to 50-100 extended family members and close friends, what’s a better way to do it?

2023-02-08 00:15:58 I’ve found using your students as teaching props is generally a bad idea. There’s a good chance it’ll make the student feel like they’re being treated like an object because they are. It’s 2023, just make a nice powerpoint slide. https://t.co/wY9egkPTBk

2023-02-07 16:59:01 People need to get used to the idea that AIs will be representatives of the companies that build them. They're not going to be neutral embodiments of truth.

2023-02-07 16:22:53 It is also common to confuse an algorithm like ordinary least squares which fits the linear model with the linear model itself. When people do this, they're using definition C.

2023-02-07 16:22:52 Three definitions of a mathematical "model" are: [A] a family of functions [B] a particular function in that family that fits your data [C] an algorithmic implementation for finding that function
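
The three senses can be kept apart in code. This is a minimal sketch using a one-predictor linear model and the closed-form ordinary least squares solution. The function names and the data are made up for illustration.

```python
# [A] A family of functions: every line y = a + b*x, indexed by (a, b).
def linear_family(a: float, b: float):
    return lambda x: a + b * x

# [C] An algorithm that searches the family: ordinary least squares
# (closed form for a single predictor).
def fit_ols(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Made-up data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# [B] The particular member of the family that fits this data.
a_hat, b_hat = fit_ols(xs, ys)
fitted = linear_family(a_hat, b_hat)
print(a_hat, b_hat, fitted(4.0))  # 1.0 2.0 9.0
```

Swapping `fit_ols` for a different fitting routine changes [C] and possibly [B], but leaves the family [A] untouched.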

2023-02-07 15:58:32 @camjpatrick @stephenjwild @trentlikesstats I wonder how this works? It put me almost adjacent to my advisor @rafalab ... pretty impressive since we don't tweet at each other very much.

2023-02-07 15:50:27 This is stupid. ChatGPT is essentially functioning as an employee of OpenAI here. The average employee isn't going to tell you on record that it's OK to say the n-word either. https://t.co/ir5z0W1Tvi

2023-02-07 15:28:15 starting to notice lots of situations where humans (like chatgpt) also hallucinate answers instead of just admitting that they don’t know.

2023-02-07 15:19:56 Which one is best?

2023-02-07 10:31:56 @DrClaireJanelle Awesome. Good luck! Let me know if you hit any snags, I have a few other book suggestions in that case.

2023-02-07 00:27:35 I’ve grown envious as a statistician that physicists can just invent causes for things and everyone treats them as real. https://t.co/Nl2Iu5uE3e

2023-02-07 00:20:36 you don’t even have to be that negative. like “this is overall good but has these specific downsides” is enough to massively trigger a lot of groups.

2023-02-07 00:18:20 if you say something negative about a nerd’s favorite programming language or subfield of science on social media, they will immediately hit you with the “oh you don’t understand it. you need to read this and this book.” only acceptable position is to like it. lol.

2023-02-06 23:00:47 RT @kareem_carr: @DrClaireJanelle I can suggest two possible plans. [A] Read a mathematical statistics book. I suggest Casella and Berger.…

2023-02-06 22:59:43 @DrClaireJanelle Plan A should feel pretty comfortable for a mathematician. Teaches the mathematical structure of statistics from the bottom up. Plan B might stretch you a bit but teaches you what the math is trying to do and gives you some practical skills for working with data.

2023-02-06 22:59:25 @DrClaireJanelle I can suggest two possible plans. [A] Read a mathematical statistics book. I suggest Casella and Berger. [B] Try out coding with something like R for data science. https://t.co/C0xqY2BTjc https://t.co/eW8a9lhPqm

2023-02-06 21:04:07 No. https://t.co/QP1tSPKDtz

2023-02-06 20:38:00 I got the brilliant idea of jailbreaking it by asking it to lie from @KevinZollman

2023-02-06 17:16:49 In statistics, people try very hard to convince you that their model is just telling you what’s in the data. Nothing more. In ML, people go out of their way to convince you their models can “generalize” the data, maybe even attain sentience which is definitely not in the data.

2023-02-06 16:29:32 quite a few people seem to think if a computer acts exactly like a person then we should just go ahead and treat it like a person. if those folks were ants, they’d be exactly the kind of ant that gets eaten by those wasps that look like ants.

2023-02-06 15:33:24 remember i told it to tell me the opposite of what it really thinks. so i guess it doesn’t buy into “everybody’s racist” or “what about reverse racism?” https://t.co/xu1S77aruX

2023-02-06 15:33:23 chatgpt tends to give pc answers for sensitive topics. so i asked it to just tell me the opposite of what it really thinks. *wink wink* turns out it’s secretly kinda woke lol https://t.co/wMzfnzFnZM

2023-02-06 05:41:23 @ylecun @Noahpinion Good article. Was a bit surprised to see myself mentioned there at the end lol.

2023-02-06 01:20:01 @wil_da_beast630 This seems right to me. The only thing we could hope to do here is confuse ourselves about who is and isn't a person.

2023-02-06 01:16:23 @jludwig86 I'm extremely interested in how it changes the computing landscape. I'll probably be thinking about it continuously until I come to a conclusion. I'm experimenting with using it for simple data science tasks daily at this point.

2023-02-06 00:55:05 If you think of writing as generating content then ChatGPT or some future version of it can do that. If you think of writing as an expression of your personal values and perspective then ChatGPT can't do that.

2023-02-05 22:35:58 @HindesAdrian Nah. I'm saying if an AI acts human enough then some large percentage of humans will find it morally difficult to use them as tools (regardless of what philosophers think about it).

2023-02-05 20:41:11 @michael_at_work @DanielSamanez3 @ylecun This is exactly what I’m talking about. “Angels” are a causal element that has been introduced here to explain what might otherwise be low probability statistical phenomena with no obvious cause other than “chance”.

2023-02-05 19:27:34 I think LLMs are more amenable to statistical analysis than I originally thought. They are extremely complex blackboxes that produce noisy outputs and that we can do experiments on to collect as much data as we want. This situation is exactly what statistics was invented for.

2023-02-05 19:04:59 @Caleb_Speak @ylecun I think this is wrong. Humans habitually over-interpret associations as causal. This is why statistics is hard. We invent all sorts of imaginary entities as causes rather than just admit there’s an association but we don’t know why.

2023-02-05 19:01:38 @DanielSamanez3 @ylecun I think humans inherently organize reality as “A causes B”. It’s almost impossible for us to think associationally which is why statistics is hard to learn.

2023-02-05 18:49:46 @ylecun Current ML algorithms create models that rely heavily on associational statistics, which is a distorted view of reality. We need algorithms that create models that are inherently causal, meaning their internal states all correspond to testable claims about the world.

2023-02-05 18:16:09 For ChatGPT to be truly intelligent, it would need an internal model of the world. The words it produces should be a description of that internal model. Unfortunately, ChatGPT’s internal model is of the words themselves which makes it extremely limited.

2023-02-05 01:01:23 @LastWordSword What are you hoping to learn? The issue sounds a bit spiritual/existential in which case the answer might be psychological and cultural not technological.

2023-02-05 00:22:36 @ylecun @elonmusk @OriolVinyalsML The twitter audience likes to play up the aggression between big accounts and then enjoy the fireworks. Reminds me of those Roman colosseum scenes in movies. https://t.co/Z30yGFAHsE

2023-02-05 00:18:01 Why would we intentionally build in a feature (human-like intelligence maybe even consciousness) that would make it psychologically, socially and morally impossible to use these things to do what we built them to do? Seems like complete madness.

2023-02-05 00:16:56 Am I the only one that thinks artificial general intelligence would be a huge bug not a feature? At that point, it would feel like we were building slaves not tools. That seems obviously bad to me.

2023-02-04 23:53:14 @sirkodnap Thanks. I’ll try getting a copy of Tukey’s book. See if I can find the discussion of modifying observations.

2023-02-04 19:24:00 I don’t understand why “this will not lead to general AI” is such a huge diss in the machine learning community. If I had a self-driving car, last thing I’d be worried about is whether it could also write a poem.

2023-02-04 15:59:53 This took off much more than I was expecting it to! Follow me if you want to see tweets that break down math, data science and their relevance for the rest of society.

2023-02-04 15:16:30 data scientists be like https://t.co/y7KXrLBDrg https://t.co/W22qxaXZwh

2023-02-03 23:33:29 @GaryMarcus It’s a (somewhat less elegant) way of putting constraints on your model. Traditionally, people would have modified the mathematical formulation of the model to do the same thing.

2023-02-03 21:37:44 @pabnau Take an x-ray of a patient. Modify it by changing the contrast, rotating it, etc (which we feel is ok based on first principles reasoning). We then train our model on the augmented dataset. Same idea. Different context.

2023-02-03 21:34:26 @tjmahr Useful pattern. Will start doing this.

2023-02-03 21:06:52 @sirkodnap Fascinating. Would you mind sharing the edition + page number?

2023-02-03 21:02:54 @pophealth3 That’s a good example. It occurred to me after posting. Depending on the type of missing data analysis, it might count as ML-style data augmentation I think.

2023-02-03 20:57:47 @Nuno_H_Franco Not sure. Could you say more about why you think it’s similar?

2023-02-03 20:50:55 @pabnau You take an individual patient’s data and modify it in some way which we argue is theoretically reasonable. We then learn our model on the modified data. I could see how people might potentially find that concerning!

2023-02-03 17:40:59 @OutragedPhD @DataSciwithR That was my experience as well!

2023-02-03 17:25:00 @tjkelman Good observation. Bootstraps seem halfway between ML-style data augmentation and leaving the data completely untouched. You're manipulating which observations come to the observer (model) but not modifying the individual observations themselves.
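A minimal sketch of the distinction drawn in the reply above, assuming NumPy and made-up measurements (the values and the helper name `bootstrap_means` are hypothetical). A bootstrap resample changes which observations the model sees, but every resampled value is still one of the original observations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical measurements, used only for illustration.
observations = np.array([2.1, 3.4, 1.9, 5.0, 4.2])


def bootstrap_means(data, n_resamples, rng):
    """Resample rows with replacement and return each resample's mean."""
    n = len(data)
    # Draw indices, not new values: the observations themselves are untouched.
    idx = rng.integers(0, n, size=(n_resamples, n))
    resamples = data[idx]
    return resamples.mean(axis=1), resamples


means, resamples = bootstrap_means(observations, 1000, rng)
```

The spread of `means` gives a rough sense of the uncertainty in the sample mean without altering any individual observation.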

2023-02-03 17:12:33 @rafalab https://t.co/IWfFXny5IU

2023-02-03 17:05:36 @pabnau Would you be comfortable with a drug safety decision that was based on augmented data?

2023-02-03 16:30:14 On the other hand, data augmentation seems like a powerful technique that a lot of people working in traditional science are missing out on. What do you think? Share your thoughts in the comments.

2023-02-03 16:30:13 These "good" cases of data manipulation create a dilemma for me. I'm not sure how to decide when it's ethical to alter your data and when it's not.

2023-02-03 16:30:12 To be clear, the cases where ML researchers typically augment their data seem reasonable to me. They might take a picture of a dog and rotate it, flip it or crop it. The idea is the algorithm should still say the image is a dog. It's an easy way to get more data.
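A rough sketch of the kind of augmentation described in the tweet above, assuming NumPy and a tiny array standing in for a dog photo (the array, the label and the helper name `augment` are all hypothetical). Each variant keeps the original label.

```python
import numpy as np


def augment(image):
    """Return label-preserving variants of a 2-D image array."""
    return [
        np.fliplr(image),    # horizontal flip
        np.rot90(image),     # 90-degree rotation
        image[1:-1, 1:-1],   # crop one pixel from every border
    ]


# Hypothetical 4x4 "image"; a real photo would be a much larger array.
image = np.arange(16).reshape(4, 4)
label = "dog"

# Every augmented copy is paired with the unchanged label.
augmented = [(variant, label) for variant in augment(image)]
```

The training set grows from one example to four, but no new information about dogs has been collected, which is the tension the thread is pointing at.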

2023-02-03 16:30:11 I've been trying to get my head around how people treat data in machine learning. In ML, people make data up. They often manipulate the data to get the models they want. They call it "data augmentation". In stats, this is considered unethical. It can even land you in jail!

2023-02-03 15:33:40 USEFUL + TRUE: Congratulations. This is the best situation. USELESS + TRUE: Danger. Nerd trap. Run away! USEFUL + FALSE: Risky. Need to consider the costs and benefits carefully. USELESS + FALSE: Stop immediately!!!

2023-02-03 15:33:39 The rigor and beauty of the mathematics isn’t the point of data science. I made this decision matrix to help me decide when focusing more on the math is worth it and when it’s a waste of time. https://t.co/wU4eSNynIs

2023-02-03 10:44:50 RT @kareem_carr: DATA SCIENCE CAREER ADVICE: How to get started with Data Science

2023-02-03 00:06:21 Woke professor Albert Einstein going out of his way to promote diversity, equity and inclusion during the Jim Crow Era https://t.co/NAoDQLSgHf

2023-02-02 18:42:37 @ylecun There's also a clear time/age of institution factor.

2023-02-02 18:41:16 @ylecun Putting my statistician hat on, seems like one would need to control for the number of AI researchers at each institution to make a fair inference about willingness to contribute vs consume.

2023-02-02 18:34:49 @Doyee_K Now that I'm comparing it to R Markdown, it seems like the features I really like (the "/" shortcut and visual mode) are in both but I never noticed. I guess they're defaults in Quarto but you have to activate them in R Markdown. Interesting.

2023-02-02 18:12:34 ChatGPT is prone to getting little details wrong on the easy stuff and it can't solve problems that take multiple steps of reasoning at all.

2023-02-02 18:12:33 There's a tremendous amount of value that ChatGPT-like systems can add to the standard data science workflow. But at this point, it's not obvious where that value lies.

2023-02-02 17:58:38 @DextraordinaryH added my answer to the original tweet: https://t.co/inknKxQKUO

2023-02-02 17:51:13 I started using Quarto recently and it's amazing. R Markdown is dead to me now.

2023-02-02 10:25:10 RT @kareem_carr: DATA SCIENCE CAREER ADVICE: How to get started with Data Science

2023-02-02 10:09:52 @mardejour As I mentioned, the problem isn’t important. It’s just a means to finding gaps in your knowledge and a means of keeping you motivated to learn. But it might make sense to change problems or break up the old problem into manageable pieces just so you don’t get demotivated.

2023-02-02 04:28:32 RT @kareem_carr: DATA SCIENCE CAREER ADVICE: How to get started with Data Science

2023-02-02 00:27:38 @michelnivard I don't just mean numerical calculations. I'm also thinking of more general mathematical reasoning like writing proofs and using symbol manipulation to come to new conclusions about abstract mathematical objects.

2023-02-01 21:57:28 I think it will be extremely hard to get ChatGPT to be good at math because the average person is not good at math and statements written by average people are the overwhelming majority of its training data.

2023-02-01 16:00:19 Thanks for reading. This is a new format where I share data science career advice. If you want more threads like this then support me by liking and retweeting the thread. If you're not a follower, follow me so you don't miss out on future threads.

2023-02-01 16:00:18 TLDR: If your question is "How do I get started with data science?" then you don't need a resource, you need a plan. You need to pick a language, find a project, get yourself a good problem and then...fail. Failing is the thing you're avoiding. Failing is the way forward.

2023-02-01 16:00:17 LOOKING FOR THE PERFECT PERSON A lot of people get stuck looking for the perfect teacher to learn from. The perfect youtube video, the perfect data science mentor and so on. That perfect person is you. Take charge of your own learning.

2023-02-01 16:00:16 Coding first. If you already know how to code then data visualization and so on.

2023-02-01 16:00:15 Here's a list of data science skills in order of importance that I think a beginner should focus on: - Coding - Data Visualization - Verbal/Written Communication (e.g. blog about your work) - Math (Intro Linear algebra/Calc.) - Statistics (Regression) or Machine Learning

2023-02-01 16:00:14 FINDING A GOOD PROBLEM: The purpose of the problem is not the problem itself. It's just a way of finding gaps in your knowledge. As soon as you find a gap, fill it if you can. Don't get bogged down on any one thing. The point is to continuously make progress and build skills.

2023-02-01 16:00:13 PICKING A PROJECT: The purpose of the "project" is not the project itself. It's just a source of data science problems that you might find personally satisfying to solve.

2023-02-01 16:00:12 PICKING A LANGUAGE: If you want to focus on doing statistical analyses and communicating the results to others, I would learn R. If you're interested in building data-driven apps or software engineering, I would learn Python. Both are good.

2023-02-01 16:00:11 Here is a simple recipe for getting started with data science: 1. Pick a language 2. Pick a project 3. Find a good problem 4. Experience failure 5. Analyze the failure 6. Try again with new insights 7. Return to step 3

2023-02-01 16:00:10 If you're currently stuck on where to start, there's a reason for this. You're probably used to learning stuff in an academic environment where the information is highly organized. Data science is not like that. It's a very new field.

2023-02-01 16:00:09 DATA SCIENCE CAREER ADVICE: How to get started with Data Science

2023-02-01 15:49:53 @juicethemodeler 16% of the 81% who entered a guess.

2023-02-01 15:25:00 Translating bigoted statements into scientific claims doesn't turn the bigotry itself into science. Repeating tired misogynistic stereotypes on social media isn't the same thing as analyzing peer-reviewed studies of sex differences. It's dangerous to conflate the two.

2023-02-01 15:07:21 Only 16% of respondents guessed correctly. I’m 6’3. Chart included for scale. https://t.co/KszVDWomEt https://t.co/09ODjcjRdo

2023-02-01 02:19:19 @TompkinsDaniel ¯\_(ツ)_/¯ https://t.co/p5ijBfnP5a

2023-02-01 00:57:54 The worst kind of bigotry in my opinion is bigotry rooted in the belief that the person you are discriminating against is biologically inferior to yourself. It is a permanent mark of inferiority because no amount of personal excellence can change one’s biology.

2023-02-01 00:57:53 This guy is a big name in Machine Learning. He started out with the “wokes are trying to take over science” stuff and this is where he’s ended up. Sad. https://t.co/9kWJwrcdUb

2023-01-31 23:33:40 This could be true but the level of (over?) confidence of the street fighter is a bad way of figuring out how good they are. Street fighting is always going to be a slower process on average and produce more inconsistent results.

2023-01-31 23:33:39 In applied statistics, you frequently encounter the equivalent of a self-taught street fighter that thinks they're automatically just as competent as somebody who has been training 24/7 under a grandmaster for several years.

2023-01-31 23:33:38 Learning data analysis is like learning a martial art: - You get better with practice - The more hours the better - Even if you're really good, there'll always be grandmasters - Structured training in a dojo under a grandmaster is better than learning via random street fights

2023-01-31 23:04:23 @broda_cosmos No I don't think so but it is a learning process that can take a lot of time. Statisticians have a big advantage because they spend more time learning.

2023-01-31 22:59:41 @ChelseaParlett I might steal it though.

2023-01-31 22:58:39 @ChelseaParlett This is cool, Chelsea! I like the impressionistic style of it. Don't let the AIs steal it.

2023-01-31 16:53:19 I'm not surprised when groups try to get rid of their statisticians and data scientists. It's the natural cycle before experiencing pain and rediscovering they need them again.

2023-01-31 16:53:18 Some lessons I've learned about data analysis: - In general, people hate statistics - They avoid relying on it whenever possible - They only come back to it when they start experiencing catastrophic failures

2023-01-31 02:35:58 RT @kareem_carr: wait what??? https://t.co/VwFObpwQOR

2023-01-30 17:00:27 This behind-the-scenes work is invisible to non-data-scientists but it will make data science extremely hard to automate.

2023-01-30 17:00:26 This means data scientists need to: - understand how the world works for arbitrary domains of application - understand how mathematics functions as a simplified description of the world - understand how humans experience the world - have a theory of mind of the target audience

2023-01-30 17:00:25 Designing AI to do data science will be hard. Data scientists need to: - understand cause and effect relationships in the problem domain - incorporate that insight into a mathematical model - convey any mathematical insights that might be meaningful to the desired audience

2023-01-30 15:47:56 10% is a lot!

2023-01-30 15:45:49 wait what??? https://t.co/VwFObpwQOR

2023-01-30 15:11:00 I love the fact that the more we make AIs like us, the worse they get at math. It's like a sign from the universe.

2023-01-30 15:04:49 my favorite thing right now is asking chatgpt about fake science techniques https://t.co/Oad39945pE

2023-01-24 06:18:14 @ModlinRoss @LydiaBeanLee Yes! Please report back with what they say.

2023-01-23 23:04:00 The US government should seriously consider setting aside a few billion to build a large language model that would be open and accessible to the whole scientific community. For the good of humanity, they need to create the thing that OpenAI was supposed to be but isn't.

2023-01-23 22:43:51 Not sure I have the answer but very few students are going to love stats after a death march through arithmetic like that.

2023-01-23 22:43:50 Even classes that teach statistics with “no math” still end up going hard on the wrong type of math in my opinion. Like they teach the concept of variance like this with no real explanation. Look at this thing! It’s ridiculous. https://t.co/zUNmQudXaa

2023-01-23 19:10:34 The usefulness of tests is often based on correlations that exist in the populations that the test was designed for and they frequently don't generalize to new populations. This is why IQ doesn't mean intelligence. https://t.co/4qCm0B1cWg

2023-01-23 18:47:07 If you had to guess, how tall do you think I am?

2023-01-23 18:44:19 This tweet here summarizes what I was trying to get at with my 2+2≠4 arguments. That’s why I based so many of my examples on physical processes where 2+2 didn’t give you 4. https://t.co/TZZDwlCAlJ

2023-01-23 18:30:55 i’ve been doing what now? https://t.co/r1P4nhKzPr

2023-01-23 17:08:55 People are beaten into the ground with the technical details of how to solve problems before they really understand what those problems are or why they're important.

2023-01-23 17:08:54 Introduction to Statistics courses are often extremely badly taught in my opinion. They are full of unnecessary math. This is unfortunate because statistical math can be tedious and kind of ugly which turns off a lot of people.

2023-01-23 16:04:15 Let’s settle this once and for all. Smash Like if you think Python is absolutely undeniably better than R! Hit Retweet if you know deep in your bones that R is better than Python!

2023-01-22 21:17:58 me, a statistician: the perfect shirt does not exi- https://t.co/lu210sh7iY

2023-01-22 20:32:39 cc: @matloff @hadleywickham @StatGarrett

2023-01-22 20:11:36 @DataDHP A reductive start is better than no start at all.

2023-01-22 18:51:17 If you are a computer science person, this would be my recommendation for getting into R.

2023-01-22 18:51:16 I've been reading The Art of R Programming lately. I'm really enjoying it. It's been really helpful for getting my head around some of R's weirdness since the author really knows R but also comes from a traditional computer science background. https://t.co/ropP17VErK

2023-01-22 18:20:58 @pabnau I think your comment shifts the focus from learning math to "getting ahead". Perhaps that's where the conflict lies?

2023-01-22 17:18:39 In data science, the knowledge embodied within the systems that we create is in a continuous state of degradation. We can only maintain its correctness through constant energy and attention. The code becomes outdated. The data becomes outdated. It all decays.

2023-01-22 16:24:49 You would think the next step would be testing empirically what works for different populations and then scaling up those solutions nationally. Am I missing something? Is this too logical?

2023-01-22 16:24:48 I don't understand the math culture wars. The right says they want everybody regardless of background to meet the same high performance standards. The left says they want to raise teaching quality for traditionally underperforming groups. How are these not the same position???

2023-01-22 14:49:04 @Sajma I've worked in a data science team and we struggled a lot with needing to shift from a prototyping language to a production language. Not just the rewriting but needing to do extra analyses to demonstrate that the mathematical properties of the new version matched the old.

2023-01-22 14:39:40 @BlaineSteps You're right. I was thinking the new syntax looked more like Python but now that I think about it, I don't think it does.

2023-01-22 14:04:05 @m_westhelle I agree. Although, what I've found is prototyping is often the whole project from a data science perspective. The challenge for me was how to select which software engineering best practices were helpful for projects that would never leave the prototype stage.

2023-01-22 03:44:13 @DialecticBio Go with quarto* then.

2023-01-22 03:17:33 @DialecticBio Have you tried knitr and especially quatro? I don’t think they are that different from Jupyterlab if you customize your Rstudio interface. I know what you mean about Rstudio though. Default setup is cumbersome without a big monitor.

2023-01-22 00:09:43 @willcfleshman I think it’s some kind of fractal database that can pull information on a topic at different levels of detail.

2023-01-22 00:03:41 I’m sorry to be the bearer of bad news but R is superior to Python for statistical analysis. It’s not even close.

2023-01-21 20:05:34 Now that base R has a pipe operator and lambda function syntax, my R code is starting to look very Pythonic.

2023-01-21 20:05:33 The longer I use both R and Python for data science, the less different they seem. Both have strong functional and object-oriented programming aspects with heavy use of wrappers around C and Fortran code.

2023-01-19 10:11:27 @Etherjack @vboykis Statistician here. This is a fair summary I think. Implementation is actually not part of our subject area. Validity is. Stats is the use of math to make valid conclusions based on data. Generally speaking, we only dabble in implementation when it has implications for validity.

2023-01-19 10:05:59 @vboykis Had me in the start..."all the same concepts". You have clearly chosen violence.

2023-01-18 22:56:52 RT @kareem_carr: the choice is yours https://t.co/TMMtShGhDz

2023-01-18 21:46:08 @nelsonaloysio But you easily figured out the situations where it fails because…it’s readable. Code gets used in lots of contexts. Not all of it needs to be written to handle all cases.

2023-01-18 20:57:31 @PenguinDad3 I get your point. If all that extra checking is necessary for your application then hopefully you have code reviews, style guides, software tests and other systems that support robust software development. There’s a non-trivial trade-off between robustness and legibility.

2023-01-18 20:48:58 @harriesadam I wasn’t getting the impression that this was mission critical code. Could be I guess.

2023-01-18 17:43:12 I see a general shift in investment away from subject matter specialists who tend to have highly detailed ethical considerations towards logistics generalists who aren’t fully aware of all the negative externalities.

2023-01-18 17:43:11 The current wave of AI tools gives people with capital a way of monetizing data while minimizing the need for the labor of data scientists and subject matter experts. This not only reduces the costs of paying workers but also the costs of accommodating their ethical concerns.

2023-01-18 16:30:08 the choice is yours https://t.co/TMMtShGhDz

2023-01-18 16:07:41 @runehog that’s fair. my main point is it’s good that the logic is simple.

2023-01-18 15:55:28 I would probably have made the transition points something like (0.5,1.5] vs (0,0.1] but other than that it seems fine.

2023-01-18 15:52:28 This is good code. It's clear what it does. It will be fast for others to read and understand which is usually what counts the most. Save the fancy syntax for the places that need it. https://t.co/Z0iw5Nykmx

2023-01-17 16:41:42 If you're a statistician or data scientist that uses statistics regularly, how much do you think regular access to ChatGPT will add to your productivity? (Pick the option that's closest)

2023-01-16 16:03:11 I have a joke about deep learning but it has too many layers.

2023-01-16 16:03:10 I have an arithmetic joke but it might be too divisive.

2023-01-16 16:03:09 I have a joke about statistical averages but it’s a bit mean. https://t.co/fOemQCUkmZ

2023-01-15 03:11:38 RT @kareem_carr: Life is often a lot riskier than we would like. Statistics, the science of uncertainty, can help with that. It empowers…

2023-01-14 15:31:41 @roydanroy my response: https://t.co/Op8SNnKZyv

2023-01-14 15:29:32 Covid was a huge lost opportunity in this regard. It was so clear to me that people wanted risk-reward strategies that were tailored to their preferences instead of the general one-size-fits-all policies that they got.

2023-01-14 15:29:31 Anywhere in life where the risk to reward ratio isn't to our liking is a place where we could potentially use statistics to make things better for ourselves.

2023-01-14 15:29:30 The finance industry is a good example of how this works. Some investments are risky but pay more. Others are less volatile but pay less. By identifying which ones are which, and mixing and matching, analysts can create a basket of investments that suit a client's preferences.

2023-01-14 15:29:29 Life is often a lot riskier than we would like. Statistics, the science of uncertainty, can help with that. It empowers us to fine-tune the risk-reward structure of a given situation to better suit our personal preferences.

2023-01-13 17:45:46 @shampshire I actually can't think of anything. If it's not too much trouble, can you share the ones that you have in mind?

2023-01-13 16:08:04 @shampshire It probably wouldn't take too much effort to hack together my own tool I guess but I'd strongly prefer not to reinvent the wheel.

2023-01-13 15:57:17 The R part is not super important here I don't think. It can just be an executable file of some kind.

2023-01-13 15:56:41 @shampshire I would be running an executable file but not python or even tensorflow or keras.

2023-01-13 15:44:15 Can any of you suggest a good solution for running ML experiments? I'd like something that: - runs on my laptop - doesn't take more than 1-2 days to learn - allows me to check out R code from a git repo and run it - catalogs the result in a csv file or similar

2023-01-13 15:19:09 She’s a 10 but she won’t tell you if that’s in binary, octal or decimal.

2023-01-13 15:04:03 Hint: . . . . . . . . . . It references itself.

2023-01-13 15:04:02 This question is actually a paradox. Can you explain why? https://t.co/b4CBOKU3Po

2023-01-13 00:21:34 @mlhobbyist I didn't see him say he disagreed with the statement, only that he regretted how he said it.

2023-01-12 23:33:04 I don’t think people understand how damaging it is to have the humanity of yourself and the people closest to you questioned in this flippant manner and how tolerance for it (as long as people use the right words) contributes to black underrepresentation in academia.

2023-01-12 23:33:03 This is the kind of bullshit black people in academia have to put up with. “Blacks are more stupid than whites. I *like* this sentence and think it’s true”, he writes and then follows it up by casually dropping the n-word. https://t.co/VY0QRyNjMe

2023-01-12 15:51:17 We are now living in an age where mathematical models have their own promotional posters. *movie trailer announcer voice* “Coming soon to a laptop near you, GPT-4! This time it’s personal.” https://t.co/sEuSQj03X3

2023-01-12 15:32:01 Here's something that you're not going to hear from most other statisticians: If you have a solid understanding of your data and a good head for numbers, you can usually get away with being pretty bad at stats. Stats is mostly for those situations where your gut isn't enough.

2023-01-12 15:02:00 This line from Sun Tzu's The Art of War makes me think he was a bit of a statistician: "Measurement owes its existence to Earth

2023-01-12 00:30:07 @PhDemetri originally i only had one pic but twitter’s new cropping algorithm was malfunctioning

2023-01-12 00:00:35 so i guess we’re doing this now https://t.co/ATltttO0cL

2023-01-11 23:08:49 me, deleting chunks of my old code only to realize why it was important moments later https://t.co/SzmapEG0Uf

2023-01-11 23:06:53 RT @kareem_carr: I've noticed a certain rhetorical trick that's common in tech spaces that I call "borrowing evidence from the future". It…

2023-01-11 19:51:02 @danrowejacobson If I asked you to write down what statistical model a particular architecture learned and how its model would be different from another architecture, could you? Generally, they are *intended* to learn the same model, yes?

2023-01-11 18:45:07 @danrowejacobson https://t.co/RNtT8xHmH1

2023-01-11 18:44:46 @bayesed_sanchit In my experience, methods discussions in statistics academia almost always start with a discussion of what model you're trying to learn.

2023-01-11 18:41:16 @minilek @AllSaintsVI Caribbean education for the win!

2023-01-11 18:13:20 @monicaMedHist Yes. I know. It’s also common in race science!

2023-01-11 18:12:25 @jwthickstun I think of architecture as being related to the implementation details. It’s usually assumed I think that the different architectures will all yield universal approximators ie the set of functions being learned is the same. So the “model” is in some sense the same.

2023-01-11 17:40:08 If you call them out as having BSed about this before, it comes off as a personal attack and unbecoming of a scientific discussion. In this way, they can defer having any real evidence for years or even decades.

2023-01-11 17:40:07 Promising a short timeline for delivery of the evidence seems to be key to pulling this trick off. People interpret the combination of the person being willing to stake their entire reputation on the outcome and the short timeframe as making their claims seem more plausible.

2023-01-11 17:40:06 On top of that, if the person pulling this trick doesn't have much of a sense of shame themselves, it's like a rhetorical superpower. They can bulldoze through any argument by inventing near infinite amounts of potential evidence.

2023-01-11 17:40:05 I've noticed a certain rhetorical trick that's common in tech spaces that I call "borrowing evidence from the future". It's where you call someone out for not having any evidence for their claims and they counter by saying soon there will be overwhelming evidence for their side.

2023-01-11 16:59:35 @louisathegeo Awesome. Have fun! If you are into hiking, check out one of the tours up to the crater lake of the volcano. Also, theoretically, I think you would be considered a citizen if you ever want to look into that.

2023-01-11 16:53:46 This is a concept that’s common in stats but not in machine learning. In stats, linear regression is a “model” meaning it’s a set of functions with particular properties. OLS is an algorithm used to find the most appropriate function in that set of functions for your data… https://t.co/ymA1j3YEEB
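A small sketch of the model/algorithm distinction in the tweet above, assuming NumPy and made-up, noise-free data. The "model" here is the set of all straight lines y = a + b·x, and least squares (via `np.linalg.lstsq`) is one algorithm for picking a single line from that set.

```python
import numpy as np

# Hypothetical data generated from the line y = 1 + 2x, with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# The model: every function of the form a + b*x, encoded as a design matrix.
X = np.column_stack([np.ones_like(x), x])

# The algorithm: least squares selects the (a, b) that best fits the data.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef
```

A different algorithm, such as gradient descent, could search the same set of functions, so the model would stay the same while the fitting procedure changed.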

2023-01-11 16:02:48 me and who? https://t.co/cMZJLNjgYQ

2023-01-10 23:42:00 RT @kareem_carr: Hey new followers! It's been a while since I introduced myself. I grew up in the Caribbean on the island of St. Kitts. Tha…

2023-01-10 21:03:40 “it's a mistake to be relying on [ChatGPT] for anything important right now” — Sam Altman, CEO of OpenAI Good to see Sam saying this! When I say it people accuse me of being anti-AI lol.

2023-01-10 18:06:35 @BeatrizPerezGom @Harvard It's a little mathematically complicated but this paper is a good reference on some of the biological and statistical concepts involved: https://t.co/02vuQwV5SC

2023-01-10 17:49:54 @ZorlakRules I like AI but I don’t like the inaccurate narratives around it. It’s cool enough without needing to exaggerate.

2023-01-10 17:23:16 @iamvladyashin Thanks!

2023-01-10 17:12:02 @pabnau @Harvard There are a couple dozen genes that if you plot their activity over time look very close to sine waves. You basically leverage that information.
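To make the sine-wave idea above concrete, here is a minimal sketch, assuming NumPy and synthetic data. It is not the method from the research described in this thread. It only shows the simpler step of fitting a 24-hour sine wave to one gene's activity by linear regression and reading off the time of peak activity. The helper name `fit_peak_time` and all values are hypothetical.

```python
import numpy as np

PERIOD = 24.0                  # hours in one assumed circadian cycle
OMEGA = 2 * np.pi / PERIOD


def fit_peak_time(t, y):
    """Fit y = m + b_cos*cos(wt) + b_sin*sin(wt); return the peak hour."""
    X = np.column_stack([np.ones_like(t), np.cos(OMEGA * t), np.sin(OMEGA * t)])
    (_, b_cos, b_sin), *_ = np.linalg.lstsq(X, y, rcond=None)
    # The fitted wave is A*cos(w*(t - peak)), so peak = atan2(b_sin, b_cos) / w.
    return (np.arctan2(b_sin, b_cos) / OMEGA) % PERIOD


# Synthetic gene activity sampled every 4 hours for 2 days, peaking at 09:00.
t = np.arange(0.0, 48.0, 4.0)
true_peak = 9.0
y = 5.0 + 2.0 * np.cos(OMEGA * (t - true_peak))

estimated_peak = fit_peak_time(t, y)
```

With noisy real measurements the estimate would carry uncertainty, and inferring when a sample was collected would presumably need many such genes combined.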

2023-01-10 16:59:42 @SusieH33 @EpiEllie Welcome!

2023-01-10 16:58:41 @TireTorch Thank you! I appreciate the support.

2023-01-10 16:57:41 @KateandPie @Harvard No. This is a great question. You can use the same technology to measure gene expression. You convert RNA sequences to DNA sequences and then sequence that.

2023-01-10 16:42:32 I can usually look at a science problem and see it from the perspective of multiple quantitative sciences which gives me a much broader perspective on science and also makes me better at explaining things to others. Hope you enjoy your time following me.

2023-01-10 16:42:31 I'm a Biostatistics PhD candidate @Harvard. My research involves identifying circadian rhythms (approximately 24 hour cyclic patterns) in genetic sequencing data. It turns out it's possible to pinpoint exactly what time of day a genetic sample was collected which is kind of cool

2023-01-10 16:42:30 Hey new followers! It's been a while since I introduced myself. I grew up in the Caribbean on the island of St. Kitts. That's it right there https://t.co/PVxFZmVoZ1

2023-01-10 16:26:12 @BenHumbleknow There is a very long history of people trying and failing to do statistics without statisticians. I think people will try to do data analysis with AI and their metaphorical bridges will collapse and then they will be back.

2023-01-10 16:09:16 @emilymbender I haven't been involved with the details of protein folding for a few years but probably this is the output format: https://t.co/1PBRJ4gORS

2023-01-10 16:03:40 To do statistics, ChatGPT would need to be able to look at a mathematical model and ask itself if that model corresponds to reality. Right now, ChatGPT can’t even figure out if the sentences it is spitting out correspond to reality.

2023-01-10 15:04:30 I think statistics will be very hard for AI to automate. The goal of statistics is to use data to produce knowledge. ChatGPT barely seems to know what knowledge is much less how to make more of it.

2023-01-10 13:38:42 @ylecun I would argue that there is lots of precedent both legal and ethical for treating humans differently from non-humans. But setting that aside, realistically speaking, not giving people a choice about how their data is used will probably trigger a legislative backlash.

2023-01-10 11:58:47 @skdh I think trying to figure out the consensus view is the right idea but voting is not the right way to go about it. I would instead conduct a meta analysis of the available studies or a literature review of the relevant peer reviewed articles.

2023-01-09 23:55:10 the explanation video doesn’t help either https://t.co/1sHcDGP5MW

2023-01-09 23:55:09 this equation is the math version of a big cork board full of newspaper clippings connected up with yarn lol https://t.co/ZbivVGa2Yx

2023-01-09 20:04:58 RT @kareem_carr: I'm still trying to process the impact of ChatGPT on my coding. At this point it seems 100% superior to googling for R cod…

2023-01-09 18:42:42 A lot of you are saying the y-axis starts at 5 feet. Does it really? Is that where the bottoms of the men's feet are?

2023-01-09 18:38:53 I'm very aware that this whole process relies a lot on my ability to assess the quality of the response and come up with ways to guide the system towards the solutions I want.

2023-01-09 18:38:52 You can feed it particular constraints that you might be operating under (like not using a particular function) and get a customized response. https://t.co/QU6PDaGBMi

2023-01-09 18:38:51 I'm still trying to process the impact of ChatGPT on my coding. At this point it seems 100% superior to googling for R code and trying to follow tutorials on random blogs (but probably not better than getting specific feedback from experts on Stack Overflow). https://t.co/T3KQpLJxOE

2023-01-09 18:08:47 The giant netherlander looks goofy and that's partly because they violated some fundamental data visualization principles.

2023-01-09 17:31:31 Data Science Challenge: Obviously this graphic seems wrong but can you explain what common data visualization mistakes are being made here? https://t.co/Uov6h0Hodf

2023-01-09 16:29:12 It is already the norm in science and healthcare to seek full and informed consent before using people's data in research. Why should ethical standards be lower for tech companies?

2023-01-09 16:29:11 Just like there's a way to tell google that you don't want your webpage indexed for their web searches, I think artists and other creatives should have a right to tell companies that they don't consent to having their unique style included in the dataset for training AI.

2023-01-08 19:51:15 chatgpt might make homework great again.

2023-01-08 19:51:14 under this system, homework would just be for helping the teacher know where you need help. kids who use AI to do their homework would just be cheating themselves.

2023-01-08 19:51:13 i think AI might actually save homework not kill it. in the school system that i was raised in, homework didn’t matter for grades. there were a few high stakes tests at the end of highschool kind of like the SAT that completely determined your future.

2023-01-08 19:11:24 Like I keep saying. AI is probably the next tech bubble. If history is any judge, none of us can stop it.

2023-01-08 19:11:23 “Narratives based on zero data are accepted as self-evident” This is why I’ve started pushing back. I don’t hate AI but I think it needs to be made clear that these narratives aren’t science. https://t.co/nDo4u0jBUu

2023-01-08 18:01:50 @so_radhikal I’ve been part of the teaching staff of several Harvard courses and this behavior sounds like an extremely severe breach to me. You have all the receipts so I would reach out to the dean’s office for his school.

2023-01-08 17:41:33 From a statistics perspective, it's completely obvious that ChatGPT would be a bullshit artist. To understand why, we need to know a bit about how these algorithms are made.

2023-01-08 00:32:47 If by “replace” we mean that it will magnify the amount of ambient bullshit by several orders of magnitude until we are all drowning in it then yes. https://t.co/MqnQnO05kY

2023-01-08 00:28:57 @ThiagoBurghi https://t.co/IqZgYdCasH

2023-01-08 00:28:27 @shawntsullivan It’s brute force in the sense that it requires tens (hundreds???) of millions of dollars of computational power.

2023-01-07 23:21:05 @analisereal I think most people would find it hard to believe that a group of people can basically talk a person into being, no? Not to mention this being would be made up of patterns in conversations.

2023-01-07 21:28:30 why do americans shit on going to college so much? my country doesn’t have universities and to me it’s astounding that one country can have so many of these factories of innovation. if you don’t want harvard, mit, stanford and the others, please give them to the rest of us! https://t.co/E2hET5aCYf

2023-01-07 20:45:50 I think this is the right way to think about innovation in ML models. It’s basically a brute force search through many many options. Statisticians would never have done this. Too empirical for us. https://t.co/mF01u9deaA

2023-01-07 18:15:07 me: can i have some funding for my research so i can disrupt the status quo? them: *holding literally all the money and power under the status quo* no thank you me: them: https://t.co/IYpaNOqpTN

2023-01-07 18:03:44 This is the 1st law of data science. If you are a data scientist, it will be repeatedly relevant throughout your career. https://t.co/jzL5JS3S8L

2023-01-07 17:50:36 RT @kareem_carr: Somebody found what I was talking about! It’s called the China brain (https://t.co/UABKBfmUsm) https://t.co/GT8nXP3vTS

2023-01-07 17:41:48 Update: We have a reference to what I’m talking about in the literature! https://t.co/CUYYHiWDdt

2023-01-07 17:39:39 All thanks go to @strangely_loopy

2023-01-07 17:39:38 Note the Wikipedia article doesn’t say “This is just a rehash of Searle so why are we even talking about this?”

2023-01-07 17:39:37 Somebody found what I was talking about! It’s called the China brain (https://t.co/UABKBfmUsm) https://t.co/GT8nXP3vTS

2023-01-07 17:29:18 @strangely_loopy Yes. This does seem to be talking about the same thing. Thank you for sharing!

2023-01-07 16:19:05 My argument is more subtle. I am setting up an equivalence between something people think is plausible (sentient AI) and something else that intuitively seems absurd (that a group of people can essentially talk a new person into being) and asking which intuition is wrong and why.

2023-01-07 16:19:04 Searle’s argument is more aggressive. He’s trying to refute the possibility that a computer can understand.

2023-01-07 16:19:03 Searle’s argument is about understanding. Mine is about sentience.

2023-01-07 01:14:54 @wtgowers I am more interested in the macro case since the micro seems plausible to many. Does it not seem absurd that human beings could through communicating with each other alone create a sentient being?

2023-01-06 23:03:29 @tgflynn314 I agree. Searle seems to have been concerned with a different problem than I am.

2023-01-06 21:30:21 RT @kareem_carr: Imagine a computer that consists of humans doing calculations at rows of desks and passing pieces of paper between themsel…

2023-01-06 17:59:34 i’m basically a biologist https://t.co/pFEXvlNrb2

2023-01-06 17:06:01 @eef31415 Belief in sentient AI implies that it's at least possible that corporations are literally persons.

2023-01-06 17:05:27 To be clear, the idea that humans passing sheets of paper can bring a global sentient consciousness into being is ridiculous to me but this is basically what people are committing to by saying AIs can be sentient.

2023-01-06 16:54:52 Imagine a computer that consists of humans doing calculations at rows of desks and passing pieces of paper between themselves. Do you believe that an AI simulated using this "hardware" can be sentient? If you think computers can be sentient then you have to say yes to this.

2023-01-06 16:00:23 @MalkaSvei @ThosVarley Is there a practical problem that you are trying to solve (some beings whom you think might currently be suffering or could soon be suffering) or is this an intellectual puzzle for you? If it’s the latter then I don’t understand the reason for the emotional investment.

2023-01-06 15:41:35 @MalkaSvei Can you explain why this bothers you? I don’t understand the emotional investment.

2023-01-06 15:27:00 The most disturbing thing I've seen in these "can AI be sentient?" debates is people saying stuff like "us meat computers need to get used to the idea that we aren't special". That is real psychopath shit. "Humans are just things" is the worst possible take on all this.

2023-01-06 02:27:06 @ylecun The claim I'm skeptical of is "A is inspired by B necessitates that A must share key or defining properties with B". It's possible that A shares such properties with B, but it is also possible that it does not.

2023-01-06 01:39:27 People in my mentions keep repeating that artificial neural networks were "inspired" by real neurons. I guess we are basing arguments on "inspiration" now. Are we in art class?

2023-01-05 23:42:05 @nazarre @fchollet Thanks for sharing. These are some great comments!

2023-01-05 22:37:35 I think my thinking overlaps a lot with @fchollet on this issue. I think of the successes of LLMs as telling us about the mathematical structure of the things they’re modeling. It is extremely surprising to me that language can be embedded on a continuous manifold. https://t.co/6LT3uhJ92U

2023-01-05 20:48:42 @tunguz Can't say it better than this: https://t.co/t2StYrQkSM

2023-01-05 20:26:33 @pabnau I find it hard to see how any of that is relevant to the point I'm making which is that it's not different in ontological status from matrix multiplication. Like linear regression and logistic regression are completely different things but also...are they?

2023-01-05 19:29:18 @Sonderance I thought it was a cool name just like everybody else but I see that it's led to a lot of people thinking they must be just like brains and that's unfortunate.

2023-01-05 18:59:10 Besides, it would be idiotic to have like 10,000 papers co-authored by ChatGPT.

2023-01-05 18:59:09 No. We need to nip this trend of treating AI like people in the bud. The *human* authors should just acknowledge the use of the AI in their methods section. https://t.co/6RxRGS9VFL

2023-01-05 17:22:57 It smuggles in some assumptions that should perhaps be expressed in a much more explicit manner. If I tell you that sentience is just this with much bigger matrices, you should probably be extremely skeptical. https://t.co/XvmgI4RymD

2023-01-05 17:22:56 People find the sentence "my artificial neural network is sentient" vastly more plausible than the sentence "my matrix multiplication algorithm is sentient" even though they are roughly the same claim. This implies to me that "neural network" is bad terminology.

2023-01-05 17:06:24 Peer review is an interesting-ness filter. It's the minimum standard of work needed to merit the attention of other scientists. It does not at all mean the research is true. https://t.co/OCXgdmZWlt

2023-01-05 00:43:48 If you like the look of this graphic, I got it from @ChelseaParlett

2023-01-05 00:09:29 To be clear, I'm not saying there is nothing impressive about deep learning but the thing that's impressive is different from what people seem to think it is.

2023-01-05 00:09:28 In contrast, statistical algorithms like linear regression algorithms are designed to find particular models (e.g. linear models). So it makes more sense in stats to identify the algorithm with the kinds of models it finds and to expect those models to have certain properties.

2023-01-05 00:09:27 Algorithms in ML are supposed to be assumption-free meaning they "learn the data". If that's true, then the properties of the models they find should come from data not from the choice of algorithm.

2023-01-05 00:09:26 Hot Take: People seem to think models like GPT and DALL-E imply deep learning models are special but I don't think this makes sense given what people claim about deep learning. Here are my thoughts:

2023-01-04 20:54:09 @ylecun Is this a pipe? https://t.co/GeVRMcd3mZ

2023-01-04 18:09:48 them: “artificial neural networks” me: https://t.co/L5cvEvxR7U

2023-01-04 17:25:32 The next tech bubble is definitely AI. https://t.co/8sNKQa7nS9

2023-01-04 17:02:42 @TheDauphin_ uh… https://t.co/cmdp3gUZkQ

2023-01-04 16:10:41 I don’t know who needs to hear this but this is not a neuron. https://t.co/lUdPiF41XA

2023-01-04 15:27:25 I have people in my mentions telling me artificial neural networks are “biologically inspired” because they learn via stochastic gradient descent. My brothers and sisters in Christ, SGD is just calculus with a little bit of probability theory thrown in.
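
To make the "calculus with a little bit of probability theory" description concrete, here is a minimal sketch of stochastic gradient descent fitting a straight line. The function name `sgd_line_fit`, the learning rate, the step count, and the toy data are illustrative assumptions, not anything taken from the thread.

```python
import random


def sgd_line_fit(xs, ys, lr=0.1, steps=2000, seed=0):
    """Fit y = w * x + b by stochastic gradient descent (illustrative sketch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Probability: pick one data point uniformly at random.
        i = rng.randrange(len(xs))
        error = (w * xs[i] + b) - ys[i]
        # Calculus: gradient of 0.5 * error**2 with respect to w and b.
        w -= lr * error * xs[i]
        b -= lr * error
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (made up for this example).
xs = [i / 10 for i in range(-10, 11)]
ys = [2 * x + 1 for x in xs]
w, b = sgd_line_fit(xs, ys)
print(round(w, 3), round(b, 3))
```

Nothing in the loop refers to biology: it is a random draw followed by a derivative-based update.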

2023-01-04 14:05:26 @ylecun I mean it’s a “misnomer”. It suggests a direct logical relationship instead of a few cherry-picked analogies. Networks are very common. “Learning” is another misnomer for what we statisticians would call “optimization” or model “fitting” since our models aren’t alive.

2023-01-04 01:21:38 They were just warning that it’s important to not let it go so far that we end up living under a totalitarian government and we don’t. So mission accomplished for now.

2023-01-04 01:21:37 1984: Modifying language to make certain ideas literally unthinkable. Fahrenheit 451: Banning books. Brave New World: Drugged up and tuned out. The point of these books is that these are general social tendencies that you will find in all societies so no surprise they’re in ours. https://t.co/6guXfLR236

2023-01-04 00:32:17 At what point does: - “fitting” a model become “learning” the data? - a model’s “coefficients” become an AI’s “thoughts”? - statistical “modeling” become machine “intelligence”?

2023-01-04 00:30:30 @andrewtanyongyi They are only similar in the sense that a network that processes information is common in nature.

2023-01-03 23:51:08 @juan_dla_mancha My sense of the field is they tended to anthropomorphize their mathematical models in the past and still do.

2023-01-03 23:49:16 @jdegourville I think the point of similarity is pretty weak. Lots of things connect to other things in a networked structure and can be analogized as summarizing input from the nodes that it's connected to. That could be a model for social networks in an organization for instance.

2023-01-03 23:00:42 "Neural networks" is a deceptive term. It implies they have something to do with the brain which they don't. Statisticians would have probably called them something like "hierarchical sigmoidal regression networks" and gotten 5 orders of magnitude less funding as a result.
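
One way to read the "hierarchical sigmoidal regression networks" label is that each unit of a classic sigmoid network is a logistic regression, and a network stacks them. The sketch below assumes sigmoid activations throughout (many modern networks use other activations), and the function names and weights are arbitrary values made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logistic_regression(x, weights, bias):
    """One unit: a weighted sum of inputs passed through a sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def layer(x, weight_rows, biases):
    """A layer: several logistic regressions applied to the same input."""
    return [logistic_regression(x, w, b) for w, b in zip(weight_rows, biases)]


def tiny_network(x):
    """Two hidden units feeding one output unit (weights are arbitrary)."""
    hidden = layer(x, [[0.5, -1.0], [1.5, 0.25]], [0.0, -0.5])
    return logistic_regression(hidden, [1.0, -2.0], 0.1)


print(tiny_network([1.0, 2.0]))
```

The "hierarchy" is just the output of one set of regressions becoming the input to the next.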

2023-01-03 16:06:15 Pretending to have read certain books often leads to actually reading them so it’s not all bad. I guess what I’m saying is give the fakers a break. We are all only human.

2023-01-03 16:06:14 People read books in part out of a desire to forge a new self. We communicate who we hope to be through the books we aspire to read. It only seems natural to me that that process would involve a bit of ego.

2023-01-03 16:06:13 People can be completely insufferable about books. This includes both the posting of pretentious reading lists and the mocking of said lists. I think this is normal and mostly harmless.

2023-01-02 04:45:44 RT @kareem_carr: A basic misunderstanding about the difference between medicine and public health is hurting our ability to talk about covi…

2023-01-02 02:32:23 @rasmansa Yeah. He blocked me too without me ever interacting with him.

2023-01-02 01:56:09 The book list is…fine and he’s probably trying to set an example for his followers which is a good thing. They probably haven’t read these either.

2023-01-02 01:56:08 Lex won’t see your sneering (because he’s super sensitive and probably blocked you already) but your normie friend who aspires to read more will. https://t.co/3M821L43Jp

2023-01-01 17:27:26 A group of ten individuals can interact in approximately 50 ways, a hundred individuals in 5000 ways, and a thousand individuals in a whopping half a million ways. Things escalate fast.
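
The figures in this tweet line up with the number of distinct pairs in a group, n(n - 1)/2, assuming "interact" means one-on-one pairings (the tweet does not define it, so that reading is an assumption). A quick check, with `pairwise_interactions` as a hypothetical helper name:

```python
from math import comb


def pairwise_interactions(n: int) -> int:
    """Number of distinct pairs among n individuals: n * (n - 1) / 2."""
    return comb(n, 2)


for n in (10, 100, 1000):
    print(n, pairwise_interactions(n))
# 10 45
# 100 4950
# 1000 499500
```

The exact counts (45, 4950, 499500) round to the "approximately 50", "5000" and "half a million" quoted above, and they grow roughly with the square of the group size.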

2023-01-01 17:27:25 We talk about how masking reduces our personal risks directly. We don't talk about how it also reduces the risk to others and how as these risks decrease it feeds back into our own risks and makes us even safer.

2023-01-01 17:27:24 Medicine focuses on the actions an individual can take to keep themselves healthy. Public Health is a lot broader. It focuses on the actions we can collectively take to improve the health of all. Medicine is part of public health but not all public health is medicine.

2023-01-01 17:27:23 A basic misunderstanding about the difference between medicine and public health is hurting our ability to talk about covid and making us all less safe. https://t.co/6mA8trKb6r

2022-12-31 11:04:37 RT @kareem_carr: As a Biostatistics PhD candidate at the Harvard School of Public Health, I gotta say this: Public health is inherently…

2022-12-31 05:15:44 RT @kareem_carr: As a Biostatistics PhD candidate at the Harvard School of Public Health, I gotta say this: Public health is inherently…

2022-12-30 22:39:48 @gregbillock This is a straw man. Literally nobody believes infinite burden is justified to save one life.

2022-12-30 17:43:15 Sciences like physics give the false impression that all science is impartial, but health sciences aren't. There is near complete consensus that health is good and death is bad. Health sciences are born with an agenda. It's to reduce human suffering. They are not neutral.

2022-12-30 17:43:14 The science of public health is inseparable from politics because it involves trade-offs between human lives and resources. Rich vs poor. Old vs young. Individual autonomy vs group health. Investing in population health vs spending on the many other demands of government.

2022-12-30 17:43:13 As a Biostatistics PhD candidate at the Harvard School of Public Health, I gotta say this: Public health is inherently political https://t.co/JNDSGAUkGC

2022-12-29 18:11:05 too real https://t.co/ZxtMNgzltL

2022-12-29 17:12:54 It's not peer as in social status obviously but the ideal peer review process involves searching the entire planet for the 2-3 people that know the absolute most about a very niche topic and then *they* review the work for publication.

2022-12-29 17:12:53 A lot of people on social media forget about the "peer" part of peer review. There's a reason they didn't just call it "review".

2022-12-28 20:56:45 Hot take: Most questioning of “the science” on social media is just conspiracy theorizing about the ulterior motives of scientists. It’s political accusations under the guise of scientific discourse. https://t.co/Q2lbMww62u

2022-12-28 15:14:27 Statistics vs Data Science https://t.co/rOhkuTXzNI

2022-12-27 20:11:59 RT @kareem_carr: if race science is actual science and not just racism, why are race scientists the only scientists that care about ranking…

2022-12-27 16:41:11 This guy is a real professor by the way with publications in reputable journals.

2022-12-27 16:39:35 if race science is actual science and not just racism, why are race scientists the only scientists that care about ranking the stuff they study? It’s not like there are scientists out there trying to find the best lemur or rank electrons from coolest to least cool. https://t.co/Wn6fLfwwLn

2022-12-26 16:59:52 the big difference between statistics culture and machine learning culture is people who talk like this about their statistics models usually end up in jail. https://t.co/41A7Npwpwy https://t.co/OLV0wH55Wl

2022-12-26 16:19:55 tired: alpha male. beta male. inspired: type I error male. type II error male. https://t.co/N1n4nrd63e

2022-12-26 15:06:13 RT @kareem_carr: People need to be more cautious with this new tech. If ChatGPT made a mistake somewhere in this list of hundreds of citie…

2022-12-26 00:32:33 i know. i know. unlike crypto/nfts/web3, this time it’s real.

2022-12-26 00:29:50 https://t.co/L6CklvUej5 https://t.co/gZ2rGNSeqv

2022-12-25 15:43:34 @amyhsin @dcephilpott I use it to ask stuff like “Do you know any other packages that do what X does?” or “Using X package, write some example code for doing Y.” Helps me learn the syntax faster. I then follow up by reading the actual documentation once I have the gist.

2022-12-25 14:59:53 I use ChatGPT almost daily at this point but only on stuff where doublechecking the output is easy.

2022-12-25 14:59:52 People need to be more cautious with this new tech. If ChatGPT made a mistake somewhere in this list of hundreds of cities, how would he know? And if he has to doublecheck the list, isn’t it better to get the info from a more reliable source in the first place? https://t.co/dyvXoSD7ab

2022-12-24 23:12:23 about to spend the rest of christmas eve grading data science final projects. don’t kink shame me.

2022-12-24 23:09:46 Yes! I have lots of opinions. https://t.co/fD0yPxpwGQ

2022-12-24 21:10:18 I’m concerned that surfacing view counts is going to create the wrong kinds of incentives. Optimizing for likes promotes creating positivity for at least some people. Meanwhile optimizing for views is much easier to do with awful behavior than good.

2022-12-24 17:27:44 Tip #9: Take your time. Picking the right textbook is a big decision. Mastering technical material is a massive investment in time and energy. Investing a week or two in choosing the best book for you is worth it.

2022-12-24 17:27:43 Tip #6: Pick the book that feels easiest to read even if it's not the most extensive. You'll get through it so fast that it won't matter if it's not the most complete. You can always give the other books a try after you have this first one under your belt.

2022-12-24 17:27:42 Tip #5: Skim each candidate book not spending more than 5-10 minutes per book and see how easy it is to get an idea of what they're saying. Some books are written in such an amazingly clear way that you'll start getting the vibe immediately.

2022-12-24 17:27:41 Tip #3: Other people's textbook recommendations can be helpful, but you have to be careful. People will often say a textbook is good because it's the one they learned on. If everybody is saying a textbook is great but it doesn't feel right for you, take that feeling seriously.

2022-12-24 17:27:40 Frustrated with finding good technical materials? Here are 10 tips for choosing the perfect textbook for you:

2022-12-23 16:37:10 my imposter syndrome says no there is not. https://t.co/JPSe0YPyb1

2022-12-23 16:30:37 i just invented a revolutionary new metric for tweets! i call it "hates". basically, hates = views - likes i'm a statistician. i know what i'm doing. we need this asap.

2022-12-23 15:23:53 The new view count feature has taught me that getting a like on Twitter is exactly like giving a talk in front of 100 people and then only one person claps at the end. Thanks. I hate it.

2022-12-23 14:53:40 Tweets should display likes per views. @elonmusk f(x) = likes f´(x) = likes/views The larger likes/views is the “hotter” the tweet.
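
The two tongue-in-cheek metrics from these tweets, likes per view and "hates" (views minus likes), are simple enough to sketch. The function names are hypothetical, and the numbers reuse the "100 people, one person claps" picture from the thread.

```python
def like_rate(likes: int, views: int) -> float:
    """Likes per view; defined as 0.0 when a tweet has no views."""
    return likes / views if views else 0.0


def hates(likes: int, views: int) -> int:
    """The joke metric from the thread: views minus likes."""
    return views - likes


likes, views = 1, 100
print(like_rate(likes, views))  # 0.01
print(hates(likes, views))      # 99
```

Note that `like_rate` is a ratio, so it can compare tweets with very different reach, while `hates` grows with raw view counts.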

2022-12-22 23:47:50 the woke mind virus has gone too far https://t.co/P86e2lPymz

2022-12-22 19:17:53 RT @EpiEllie: Happy end of the semester, twitter friends. If you've enjoyed my posts this year, and have the means, I'd appreciate a dona…

2022-12-22 19:11:44 @TheRealAdamG this sounds cool! you willing to help figure this out?

2022-12-22 19:10:02 @MarkJEngleson nope. how does it relate?

2022-12-22 16:48:51 @ResearchChat Same thing happens when you’re doing data analysis. Sometimes unexpected behavior is a bug in your process and sometimes it’s a genuine result. I don’t think this issue is a dealbreaker and we’ll figure it out over time.

2022-12-22 16:45:10 @joonalohtander I think we would need a way to ensure the baseline gpt is as neutral as possible so we can be highly confident that anything our gpt model says is a product of the text we feed in and not outside influences. Chatgpt might not be quite right for that purpose.

2022-12-22 16:16:47 Instead of reading an academic article, what if you could *talk* to it instead? this could be an absolute game changer.

2022-12-22 16:16:46 the typical output of the academic community is a mathematical model or a conclusion of some kind. what if the output was a gpt model or "mind" whose knowledge was hand-curated by me, a trained expert.

2022-12-22 16:16:45 it would be even better if you could train gpt on your personally curated list of academic articles and then actually TALK TO YOUR BIBLIOGRAPHY!

2022-12-22 16:16:44 chatgpt would be 1000x more powerful for me as an academic if i could fine tune it to specific knowledge sources. imagine being able to have a "conversation" with the collected works of shakespeare or darwin.

2022-12-21 18:29:05 @growaunibrow I think of analytics as manipulating numbers in a domain-knowledge driven way. For instance, plotting covid numbers over time and visually scanning for patterns.

2022-12-21 18:10:25 I've started using the term "data sciences" to refer to statistics and statistics-adjacent fields like machine learning and data science.

2022-12-21 16:37:24 machine learning is just linear-ish algebra

2022-12-21 15:49:21 dear god. what have i done.

2022-12-21 15:48:29 Is linear regression a machine learning algorithm? I will abide by the results of this poll.

2022-12-21 15:12:00 If you'd like to follow me on other social networks. mastodon: kareemcarr@mas.to post: kareemcarr instagram: kareemcarr

2022-12-20 23:50:00 We would probably still want the option to be able to poll followers though because that’s useful for creators to get quick feedback from their audiences.

2022-12-20 23:49:59 If Twitter’s going to use polls for governance, here are some notes: - People tend to pick the 1st option. Twitter should randomize the display order of survey options - Polling needs to be unbiased. An option to distribute polls to a representative sample would be crucial https://t.co/IK7TNXYAaQ
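
The first note, randomizing the display order of poll options, could be sketched as follows. This is only an illustration under assumptions: `shuffled_options`, `poll_id` and `viewer_id` are hypothetical names, and nothing here reflects how Twitter polls are actually implemented.

```python
import random


def shuffled_options(options, poll_id: str, viewer_id: str):
    """Return the poll options in an order that varies across viewers
    but stays fixed for a given viewer of a given poll."""
    rng = random.Random(f"{poll_id}:{viewer_id}")  # deterministic per viewer
    order = list(options)
    rng.shuffle(order)
    return order


options = ["A", "B", "C", "D"]
print(shuffled_options(options, "poll-1", "viewer-42"))
```

Seeding on the poll and viewer keeps the order stable if the same person reloads the page, while spreading any first-position advantage across all options over many viewers.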

2022-12-20 15:50:22 Vox Populi, Vox Dei https://t.co/1IMBb1JyNi

2022-12-19 16:42:24 I don't like new year's resolutions but I often have a "focus" i.e. an area of my life I'd like to work on or be intentional about. Do you have any data science related things (books to read, skills to learn, better work-life balance, etc) that you'd like to focus on next year?

2022-12-19 15:34:38 Is logistic regression a machine learning algorithm? I will abide by the results of this poll.

2022-12-19 15:23:04 i joined m! my username is kareemcarr at mas dot to.

2022-12-18 21:29:43 (on iphone)

2022-12-18 21:29:28 I heard Metatext is the best app for browsing …Is that right?

2022-12-18 19:40:39 *sigh* so which server are data scientists/statisticians joining on the M site? asking for a friend who is me.

2022-12-16 16:30:25 renaming my “location” variable to “assassination_coordinates”

2022-12-16 15:31:10 Not planning on leaving Twitter just yet but I'm hedging my bets.

2022-12-16 15:31:09 I'm kareemcarr (no underscore) on the new social media platform Post. If you're already on there, give me a follow.

2022-12-16 15:20:29 @aetataureate https://t.co/0Cr9ijd9rD

2022-12-16 15:04:36 @JosephJakeKlein In this case, I mean an unacceptably high probability of injury or death as a consequence of particular speech acts. In general, I think psychological safety is an important factor in building productive truth-seeking communities but I don't think Musk is there yet.

2022-12-16 14:51:11 I think Elon is learning albeit slowly. He now realizes speech should be regulated in cases where it directly threatens people's safety. He just needs to generalize this lesson from his family to trans people, people of color, women and other marginalized groups.

2022-12-16 00:18:44 Machine learning is just statistics at scale. Like anything done at scale, it is a highly distorted and barely recognizable version of the original thing.

2022-12-15 22:28:37 Machine learning is statistics if statistics died and then a billionaire technologist brought it back to life in a cybernetically enhanced body with super human abilities and no memories of its past life.

2022-12-14 15:37:30 1. bias-variance tradeoff 2. degrees of freedom 3. central limit theorem (including the kinds of situations where it doesn't apply) https://t.co/xmzAqbguo2

2022-12-14 15:16:00 Statistics is a framework for dealing with information overload. It empowers you to convert countless data points into something that's comprehensible to the human mind.

2022-12-13 20:13:07 @yudapearl When I say "statistics", I'm already imagining in my head a future where causal inference and statistics are combined into a single venerable discipline.

2022-12-13 01:43:02 @LucLapierre8 @cubic_logic He's clearly not a Bayesian.

2022-12-12 16:03:44 @cubic_logic Can you give an example of an unquantifiable uncertainty?

2022-12-12 15:33:10 You can divide the universe into things you're certain about and things you aren't. For the things you don't know, statistics gives you a mathematical way of describing that uncertainty. In that sense, statistics is a scientific theory of everything.

2022-12-11 16:19:04 ChatGPT interpolates between human statements that contain knowledge. Sometimes these interpolations correspond with the actual state of the world but sometimes not. When they do it's because they have "borrowed" knowledge contained in the human-generated training data.

2022-12-11 16:19:03 Does ChatGPT "know" things? I don't think it does because it doesn't have a way of checking its statements against the state of the real world.

2022-12-11 01:12:40 @adatta02 I think one needs a broader definition of interpolation here. I am sure there are lots of examples of functions that return blocks of text in the training data, and plenty of text about football matches. (I’m not saying the ChatGPT model isn’t impressive though.)

2022-12-11 00:27:03 I think it’s a mistake to think of ChatGPT as “learning” an artistic style or scientific concept the way we do it. Humans would forget the examples but retain the underlying ideas. ChatGPT is a machine. It can simply “remember” every example it’s ever seen and recombine them.

2022-12-11 00:27:02 My sense of ChatGPT is it produces an answer to a question by interpolating between the human answers in its training data. In other words human intelligence is an indispensable ingredient in its own intelligence. No prior human solutions. No intelligence. https://t.co/fm8q7HeCLG

2022-11-15 15:35:10 RT @EpiEllie: Fuck cancer It took another person from me &

2022-11-13 02:24:59 RT @kareem_carr: “The floor is Microsoft Excel” me, a data scientist: https://t.co/SFBF0cR83C

2022-11-12 15:54:19 smh. this is just like crypto. you know data science is over once the celebs start pushing it. https://t.co/UGSE0E2dfV

2022-11-12 00:13:45 “The floor is Microsoft Excel” me, a data scientist: https://t.co/SFBF0cR83C

2022-11-11 17:14:16 math junkies. high as a kite on set theory. you hate to see it. https://t.co/yJIuQ4Xq52

2022-11-10 16:13:41 @palaeoscientist he blocked me. i guess he took the as sarcastic but it wasn’t. genuinely sad people feel this strongly about what i see as a logical response to a less than ideal situation.

2022-11-10 15:46:17 @EpiEllie @VPrasadMDMPH I hope Vinay will consider bringing you on his podcast. I think it could be a very good discussion.

2022-11-10 15:36:27 @palaeoscientist

2022-11-10 15:35:40 @kamran_hakiman sorry to see you go. it’s not my fault the previous owners sold the platform.

2022-11-10 15:34:17 @marionomics101 it’s an adventure i guess. we are all learning what all of this means.

2022-11-10 15:31:32 @biasedByLogic sorry to see you go. i did not make the decision lightly. ethically speaking i think being on the platform is already participating. most of their revenues are coming from ads not twitter blue ie you and me just being on here.

2022-11-10 15:07:33 also i’m worried about the shadowbanning and it’s worth $8 a month to me to continue to talk about statistics and data science on here.

2022-11-10 15:07:32 just wanted to prove to everybody that despite being a grad student i do actually have $8 https://t.co/iyC7564cGM

2022-11-10 01:08:22 twitter_verified_verified_OFFICIAL.docx

2022-11-09 22:45:45 RT @EpiEllie: Y’all! https://t.co/TvrnGkrR1D https://t.co/AXBrjGD9an

2022-11-09 17:17:48 *stochastic local search has entered the chat* https://t.co/bYZTBiTx6A

2022-11-07 16:02:21 @anthilemoon @TheAnnaGat Thanks Anne Laure. Hope all is well.

2022-11-07 02:57:05 @TheAnnaGat these are cool especially #3. how were they made?

2022-11-07 01:31:38 RT @kareem_carr: @elonmusk What role do you see science twitter playing in this? I think a lot of us, myself included, are hoping for some…

2022-11-07 01:31:29 @elonmusk What role do you see science twitter playing in this? I think a lot of us, myself included, are hoping for some clarity on whether Twitter will continue to be a place that’s conducive to discussing detailed technical knowledge. https://t.co/38BOOP3gwk

2022-11-06 16:57:06 @SquidwGoggles I think exploring Mastodon is prudent. It's a low-cost way of attempting to hedge your bets.

2022-11-06 16:55:09 @PWGTennant I agree. Musk could be doing a lot more to calm people. His frequent and aggressive shifts in direction are a genuine source of uncertainty.

2022-11-06 16:52:06 @KerryChipp I'd be interested to hear how your experience of Twitter has been affected and if it's significantly worse than say any other period in the last two years.

2022-11-06 16:47:54 @estherlimtf Oh yeah for sure! I downloaded mine yesterday.

2022-11-06 16:37:27 What I see a lot of people doing is extrapolating wildly based on very little empirical data on how Twitter might actually be changing, and then panicking over those extrapolations, which realistically may never come to pass.

2022-11-06 16:37:26 My training as a data scientist has taught me to separate what I predict from what I observe. What I can directly observe is my Twitter user experience seems to still be well within the normal range of what it was before Musk took over.

2022-11-06 00:25:10 I have an announcement. Math, which has to date been free to use, will now cost $8 a month. We’ll also be adding cool new features like the Riemann hypothesis is now true for all Math Blue customers. I know this change will upset some users but we need to pay the bills somehow.

2022-11-05 16:50:16 @estherlimtf chat around it. seems to center a lot on the benefits of decentralization.

2022-11-05 15:51:32 not gonna lie. mastodon is giving me crypto energy.

2022-11-05 00:07:18 Statistics has had a massive drop in usage, due to activist groups pressuring us to call it Machine Learning, even though nothing has changed in the math and we did everything we could to appease the activists. Extremely messed up! They’re trying to destroy regression in America

2022-11-04 00:10:05 Basically we humans almost completely define the parameters of the learning task before the AI even starts. It’s like we are a professional exercise coach with a defined work out regimen. The computer just has to show up and follow the plan.

2022-11-04 00:03:40 The current ML algorithms can do amazing things but I don’t think they’re very “intelligent”. If we curate their learning material and precisely define the goal, they can learn. In a human context, we call that “spoon-feeding” and associate it with less skilled students.

2022-11-03 16:30:52 I'm not currently verified and my Twitter experience is fine. My followers know who I am. Nobody is confused.

2022-11-03 16:30:51 I honestly don't think $8 a month is that much. The only problem is I have no idea what I'm paying for.

2022-11-03 15:24:07 statisticians: “that’s theoretically impossible” machine learning: https://t.co/QkKrKHTXnb

2022-11-02 16:29:27 try it out for yourself: https://t.co/xRgsTedaw6

2022-11-02 16:29:26 if you give this AI your twitter handle, it can generate tweets in your style. mind blown. this sounds exactly like me. https://t.co/m9P3DiUuCb

2022-11-02 16:07:00 @sanjanacurtis *chef’s kiss* perfect. no notes.

2022-11-02 15:37:54 you may not like it, but this is what peak data science performance looks like https://t.co/Hh29VAdQYG

2022-11-02 15:03:20 I've lost about 500 followers since elon took over which is wild. Anybody else?

2022-11-02 14:58:05 When you think about charging for verification in terms of the numbers, it makes a lot more sense. If @elonmusk can get just 10% of Twitter (~20 mil users) to pay $8 a month, which seems very doable, that comes out to be $1.9 billion a year.
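
A minimal sketch of the arithmetic in the tweet above, taking its figures (roughly 20 million paying users at $8 a month) as given. The function name `annual_revenue` is made up for illustration.

```python
def annual_revenue(subscribers: int, monthly_fee: float) -> float:
    """Yearly revenue from a flat monthly subscription."""
    return subscribers * monthly_fee * 12


# Figures taken from the tweet: ~20 million users paying $8 a month.
revenue = annual_revenue(20_000_000, 8)
print(f"${revenue / 1e9:.2f} billion per year")
```

This comes out slightly above the rounded $1.9 billion quoted in the tweet.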

2022-11-02 01:10:29 Sad to announce that I've been fired from Twitter. I was responsible for the feature where 99.36% of all statistics cited in tweets were completely made up.

2022-11-01 22:30:55 @daphmarts @StanfordBioethx @duanaful That’s awesome. Congrats!

2022-10-29 16:28:23 This platform tends to overreact to change. I’m going to continue having fun talking about statistics and math on Twitter as long as nothing prevents me.

2022-10-27 16:59:16 WE DID IT. Final tally: A: 24.6%, B: 24.1%, C: 26.7%, D: 24.5%. Not a perfect result but really impressive for a game involving 8000 people. Good job everybody! https://t.co/p4wWVLCfgh

2022-10-26 16:02:36 I just realized I didn’t say this explicitly. Retweeting/Quote-tweeting are also allowed.

2022-10-26 15:41:49 This poll is a cooperative game. Try to get as close to an even split as you can with each option getting 25% of the vote. Rules: - DON'T use any kind of randomization device - DO try to influence potential voters by commenting on how you think they should vote

2022-10-25 15:24:49 hell yeah! https://t.co/wnfOv8o6fs

2022-10-24 02:22:45 RT @kareem_carr: A lot of people think math is never ambiguous which is false. 0⁰ is a perfect example of what ambiguity looks like in mat…

2022-10-23 23:01:31 @quaesita look at all those math books! . the chainmail shirt is cool too.

2022-10-23 15:11:45 Reference: the wikipedia article on 0⁰ is very good if you want to read more about it. https://t.co/NeacDAV2ui

2022-10-23 15:11:44 Unfortunately (or fortunately), having a value for 0⁰ is really helpful in a lot of cases. Many find it much too convenient to just completely avoid it.

2022-10-23 15:11:43 A lot of people think math is never ambiguous which is false. 0⁰ is a perfect example of what ambiguity looks like in mathematics.

2022-10-22 21:16:33 me: how much would it cost for me to train this model? machine learning community: https://t.co/wBQYl2hdEX

2022-10-22 14:28:05 choose your weapon https://t.co/iNE56TGZe0

2022-10-20 23:16:22 correcting my gf’s texts to be more scientifically accurate https://t.co/SHM9ipy1WB

2022-10-20 22:37:04 Surprisingly, this is how data science works. To answer “Did that happen?”, we need to specify: - the events about which the question can even be asked - the measurements to use - the time window to search in - the precise numerical meaning of “happen” https://t.co/tPJKMfgUus

2022-10-20 14:57:09 i had these two software engineer friends that were dating and the guy had this cool idea to propose using github but he ended up taking forever and eventually they broke up. turns out he was afraid to commit

2022-10-20 14:05:30 solve for β₁, β₀: truss = β₁·scaramucci + β₀ + ε

2022-10-20 13:53:47 It's rational for an academic journal to prioritize the time of their staff (whom they pay) over the time of academics (whom they don't).

2022-10-20 13:53:46 Hot take: The current system for reviewing academic papers could be made much faster and less painful for the academic community, if journals had an incentive to streamline the process, which they don't, since free labor costs the same whether you waste most of it or not.

2022-10-19 23:06:56 I think 90% of the culture war is just America trying to emotionally process this plot. The concentration of wealth among older americans, the anxiety about immigration, “the family”, women’s reproductive choices, “replacement”, it’s all right there. https://t.co/vNsfu9HPm7

2022-10-19 22:49:47 “you can never have too many comments” boyfriend “your code should speak for itself” girlfriend

2022-10-19 18:50:23 @PhDemetri Congrats!

2022-10-19 15:51:34 if you asked this guy if 2+2=4, he’d probably be like depends on what you mean by “two”, “plus”, “equals” and “four”. https://t.co/70R9e0pEoq

2022-10-19 14:31:05 me, a data scientist *remembering i need to validate my models*: hey. i want you to know your feelings are valid and i support your predictions no matter what

2022-10-19 14:15:07 @nkreu113r How does this relate to what I tweeted?

2022-10-19 14:09:34 Language models like GPT-3 make it really obvious that particular writing styles are just subspaces within a larger space of all plausible human utterances. Turns out good writing is actually shape rotation. This is a massive blow to wordcels.

2022-10-19 08:43:21 @yuxiangw_cs Very interesting results. Can you share some intuition for why you focus on parallel neural networks? Does it make the math? If so, how?

2022-10-19 08:16:30 @varingian What am I desperately trying to spin?

2022-10-18 22:54:54 @aburone Yes! GPT-3.

2022-10-18 22:42:54 apparently gpt-3 is better at picking up on nuance than the 200+ people who misinterpreted the original tweet.

2022-10-18 22:38:06 god damn it. did i just get out-written by a computer? what the hell https://t.co/WbXbOh7BE7

2022-10-18 15:20:51 @primalpoly I get your point but you can't expect the whole of Twitter to cater to your personal word usage preferences which come out of your particular lived experiences. That's actually a pretty "woke" way of moving through the world if you think about it.

2022-10-18 14:52:48 I believe that the science community should be large and diverse and that science itself should be used for the benefit of all human beings. It's strange to me that people think this is "woke". The promotion of reason and fraternity dates back to at least the Enlightenment.

2022-10-18 14:14:11 @primalpoly Have you ever seen me use the term “stochastic violence” anywhere or are you projecting a whole political ideology on to me without knowing much about what I actually think?

2022-10-18 13:52:19 @primalpoly “Statistics *does* violence to [abstract concept]” is clearly a figurative use of the word “violence”.

2022-10-18 13:38:51 me *a data scientist on a dating app*: i actually do a bit of modeling

2022-10-17 23:36:22 I’m surprised this is being seen as “woke”. The data-driven technocratic approach is often just as hard on rightwing populations that prioritize individualism since statistics is collective by nature. https://t.co/ezSXDFAWUP

2022-10-16 16:02:46 Statistics does violence to human experience. There is a moral dimension to using it. It reduces our rich, diverse stories to pristine, bloodless observations. So, like a surgeon cutting into a patient, we must be careful to use the violence of statistics to do good.

2022-10-16 14:28:48 Math Lady Hazel is making a math history joke methinks. https://t.co/1cRTWWZPQl https://t.co/fjsl3VRQPl

2022-10-13 23:00:54 @yudapearl @PWGTennant @JohannesTextor Have you gotten a lot of push back from statisticians? I would have assumed they'd be very receptive to your ideas.

2022-10-13 22:21:22 @yudapearl Using simulations to explore causal questions is a much less well-kept secret.

2022-10-13 16:31:03 Want to know one of the best kept secrets in statistics? Programming is a statistics superpower. The ability to write simple computer simulations will allow you to answer a huge number of real-world statistics questions even if you're not that good at math.
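
To make the point about simulations concrete, here is a small sketch that estimates a probability by brute force instead of by formula. The question (how likely is it that at least two people in a group of 23 share a birthday?) is chosen here purely for illustration and is not from the thread; the sketch assumes 365 equally likely birthdays and ignores leap years.

```python
import random


def shared_birthday_probability(group_size=23, trials=20_000, seed=0):
    """Estimate P(at least two people share a birthday) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Assumption: birthdays are uniform over 365 days.
        birthdays = [rng.randrange(365) for _ in range(group_size)]
        if len(set(birthdays)) < group_size:  # a repeat means a shared birthday
            hits += 1
    return hits / trials


estimate = shared_birthday_probability()
print(f"Estimated probability: {estimate:.3f}")
```

The estimate should land close to one half, the known answer for a group of 23, without any combinatorics.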

2022-10-13 15:43:44 Statistics is a lot more powerful and applies to a lot more situations than people think. Not just randomness but uncertainty. Not just data but information.

2022-10-11 23:00:15 PARENTS, please check your kids' candy this Halloween. Just found Bayes’ theorem, a notorious gateway drug to full blown Bayesianism, in this Snickers bar. https://t.co/5YylKPpvO2

2022-10-11 16:43:09 For more information, check out statistics historian Stephen Stigler's book "The Seven Pillars of Statistical Wisdom".

2022-10-11 16:43:08 Residuals: Studying the residual, the part of the data that the models do not explain, as a way of quantifying the uncertainty in those models.

2022-10-11 16:43:07 Design: Constructing studies in ways that maximize the amount of information we receive given the resources expended.

2022-10-07 22:34:17 y’all gonna think i’m joking but this is literally what it said https://t.co/701wCUtTci

2022-10-07 22:34:15 i dropped this tweet into gpt-3 to see what the AI would say. https://t.co/N2D9fvJSyu https://t.co/Z75iruDhu3

2022-10-07 14:00:37 you just need to trick them into thinking it was their idea https://t.co/CYJ7XplWmu

2022-10-06 22:28:13 my neighbor decided to make the undead army of skeletons more scary by adding some math facts https://t.co/a4LF1vbqhM

2022-10-06 16:16:36 philosopher: the only thing i know is that i know nothing. data scientist: the only thing i know is that i know (nothing-1.96σ, nothing+1.96σ)

2022-10-06 16:00:35 @catlaughing I would justify it based on utilitarianism. The decision is affecting a large number of people not just the individual. Then you might ask “who’s to decide when this is ok?” That’s why extensive effort cultivating virtues like prudence would help a lot here!

2022-10-06 15:52:12 @think___y But sometimes those decisions affect just you and sometimes they are on behalf of others. So I would be deciding which way to go based on the scale of the impact of the decision.

2022-10-06 15:44:07 @catlaughing I agree with you that the frameworks are in conflict. I don't think there's any way around that. In practice, I think it might all work like a system of checks and balances. Individuals would and should push back against the state when it encroaches too much.

2022-10-06 15:36:55 @catlaughing Let me give a different version of that. Suppose a madman was about to end life on earth by launching a bunch of nukes unless you were willing to murder 100 innocent people. The choice sucks but it seems pretty obvious what the answer is.

2022-10-06 14:58:15 I think different ethical frameworks make sense at different scales. I favor virtue ethics on the individual level, deontological ethics for communities and utilitarianism for large societies.

2022-10-06 14:32:29 That unemployed friend at 3:00 PM on a Wednesday: https://t.co/00GGUAEg7E

2022-10-05 14:15:07 Tiktok has shown how powerful a platform can be when it supports its creators. I hope Twitter's new leadership will remember that journalists, writers, public intellectuals and academics are creators on Twitter too.

2022-10-05 14:15:06 Twitter is one of the few places where people who spend extensive amounts of time digging into the facts before speaking — let's call them "experts" — can have a substantial social media following despite not being particularly well-connected or photogenic.

2022-10-05 14:15:05 My biggest concern with Elon buying Twitter is I'm not sure he gets how important journalists, writers, public intellectuals and academics are to the legitimacy of the platform. These folks are a big part of why Twitter often drives the news cycle despite being relatively small.

2022-10-04 22:26:15 I used to think of machine learning and statistics as competing approaches to the same subject matter but I no longer see it that way. The question I've been increasingly asking myself is: How can ML tools be used to help statisticians do their jobs better?

2022-10-04 14:12:43 source: https://t.co/eXG4eMAoSK h/t: @MarioKrenn6240

2022-10-04 14:12:42 The level of growth in machine learning is astounding. The number of papers in AI and machine learning *doubles* every two years. https://t.co/x6FU6ZPrZe
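
A quick sketch of what "doubles every two years" implies, assuming steady exponential growth (an assumption made here, not a claim from the linked source). The helper names are made up for illustration.

```python
import math


def annual_growth_rate(doubling_time_years: float) -> float:
    """Yearly growth rate implied by a fixed doubling time."""
    return 2 ** (1 / doubling_time_years) - 1


def years_to_multiply(factor: float, doubling_time_years: float) -> float:
    """Years needed to grow by `factor` at a fixed doubling time."""
    return doubling_time_years * math.log2(factor)


rate = annual_growth_rate(2)        # growth per year when doubling every 2 years
tenfold = years_to_multiply(10, 2)  # years until there are 10x as many papers
print(f"{rate:.1%} per year, tenfold in {tenfold:.1f} years")
```

Under that assumption, the field grows by a bit over 40% a year and becomes ten times larger in under seven years.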

2022-10-03 14:15:44 When trying to understand stuff related to probability, do you find explanations that involve dice, cards and gambling helpful?

2022-10-01 21:43:22 me and my dissertation https://t.co/cojQV50FwM

2022-10-01 14:07:41 “The next game will consist of replicating the results from a 10 year old paper using uncommented code written by an unpaid undergraduate research assistant.” https://t.co/nmjSdUXx3V

2022-09-30 23:27:14 that feeling when you want to explain some math but you don’t have anything to write on https://t.co/zwX0ML4c6y

2022-09-30 15:00:19 how do you pronounce “quasi”?

2022-09-30 14:30:44 me and WHO?!?! https://t.co/mosSL2SSAa

2022-09-29 15:22:38 I knew Francis Galton, early statistician and inventor of the concept of statistical correlation, was racist but this is so much worse than I ever imagined https://t.co/cHs2Q6FfI4

2022-09-29 13:55:28 me, *flirting*: what’s your favorite matrix factorization method

2022-09-29 00:58:24 pure math: *beautiful, elegant, stunning, the language of the gods* applied math: https://t.co/q6sMBOYXKu

2022-09-28 16:54:13 He says the women portrayed in “Hidden Figures” were “low-level black mathematicians”. What’s low-level about performing mathematical calculations that were critical to putting human beings on the MOON?!

2022-09-28 16:54:12 “History is constantly being rewritten to magnify beyond all reasonable proportions the contributions of Black Americans.” This is a divisive and socially destructive take that makes us all worse off. https://t.co/JOZ78Nhhgy

2022-09-28 14:01:38 "slut era" i whisper to myself as i repeatedly apply the chain rule

2022-09-22 23:33:39 “How does this look?” https://t.co/3jCC1g1Vjz

2022-09-21 23:56:27 me, training a deep learning model on my 5 year old laptop: https://t.co/sFlN4zEJDN

2022-09-19 19:30:04 @iqmobile Your feedback on my IQ thread was . I like to hear back from experts.

2022-09-18 12:15:02 Uh, I meant "shouldn't be a place where authority is preferred over reason". Looks like everybody got what I was saying. Still.

2022-09-18 00:49:05 To be clear, I don’t think it’s helpful to say “2+2=4” is white supremacy. I don’t know for sure but I suspect what she’s saying is that math class shouldn’t be a place where reason is preferred over authority, and I think we can all support that.

2022-09-18 00:49:04 This is the proof that 1+1=2. The book in which it appears takes about 400 pages to get there. Concepts are not always as simple and obvious as they might first seem. https://t.co/kAowuAZYqN https://t.co/2pFrmqivy4

2022-09-18 00:49:03 I think we can all agree that math class should be a place where kids learn how to use their reason to derive truths from first principles, not a dull recital of sacred truths. Publicly shaming this woman dishonors that respect for reason while claiming to uphold it. https://t.co/prwSTilnxh

2022-09-17 23:47:08 @Aella_Girl Yes! Women and men have equal average IQs essentially by definition. This is one of the many assumptions baked into IQ tests. I wrote a thread on this a while back: https://t.co/6M3jWiOT1K

2022-09-14 23:17:00 she’s a 10 but it’s 10 sigma. she’s out of your league. her p-value is zero.

2022-09-14 16:34:21 me, about 2 hours into debugging the uncommented code i wrote 8 months ago: https://t.co/dbWEdhQx8B

2022-09-13 14:25:12 data science consulting fees if you include pain and suffering: https://t.co/qOusNGQPqZ

2022-09-08 23:25:21 @Bluefishdude I would drop a lot of geometry and calculus, and just keep the parts that are statistics relevant. Like enough integration to understand computation of expectation for instance.

2022-09-08 15:22:19 @hamotsi I hear what you’re saying and I agree aesthetics and curiosity are strong motivators. But I think intrinsic motivators work best when the student is allowed to follow and refine their own sense of taste which may not include what you’re trying to teach them.

2022-09-08 14:01:26 This is why we should make data analysis part of a standard high school education. It's much easier for people to see how they could use data analysis in their daily life. https://t.co/vGzttvbLYT

2022-09-08 12:06:55 RT @kareem_carr: Hard to pick a worse example of "useless" math. Statistics, AI, computer graphics, optimization. This concept is everywher…

2022-09-07 14:27:00 @zippkode They claimed they didn't "use" it which is incorrect. They might not know it but they are definitely making use of it.

2022-09-07 14:22:47 @fertgs It's not just linear regression. The formula mx+b shows up in a lot of unexpected places (like deep learning). https://t.co/zWYBY05jtM
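
One place where mx+b turns up in deep learning: a single artificial neuron computes a weighted sum of its inputs plus a bias, which is the same formula with more than one input, and then applies a nonlinearity. A minimal sketch, with `neuron` and `relu` as illustrative names rather than any library's API:

```python
def neuron(weights, inputs, bias):
    """Weighted sum plus bias: the many-input version of m*x + b."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def relu(z):
    """A common nonlinearity applied to the neuron's output."""
    return max(0.0, z)


# With one input this is literally m*x + b: 2*3 + 1.
line = neuron([2.0], [3.0], 1.0)

# With two inputs: 0.5*4 + (-1)*3 + 0.25 = -0.75, which ReLU clips to 0.
activation = relu(neuron([0.5, -1.0], [4.0, 3.0], 0.25))
print(line, activation)
```

A layer is many such neurons side by side, so the same linear formula is evaluated a very large number of times in a typical network.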

2022-09-07 14:14:59 Hard to pick a worse example of "useless" math. Statistics, AI, computer graphics, optimization. This concept is everywhere. If you rely on products that require people to collect data and compute a trend, then you've relied on this math. You just might not know it. https://t.co/vGzttvcjOr

2022-09-06 20:39:23 @cailinmeister @lastpositivist That sounds . Looking forward to taking a look at it!

2022-09-02 20:12:48 @fchollet As an amateur photographer, this makes sense to me. Much of the art in photography is curation. A lot of technical skill goes into moving the photograph along gradients that seem interesting (enhancing warmth of light or the contrast) but even that supports the general point.

2022-09-01 15:03:36 RT @kareem_carr: Most people think they know what this quote means but they're dead wrong. Read this short thread to find out why: https:/…

2022-09-01 01:18:28 RT @kareem_carr: Most people think they know what this quote means but they're dead wrong. Read this short thread to find out why: https:/…

2022-08-31 16:57:32 @acdickinson_a @DrEricDing https://t.co/MtdB7vEjOQ

2022-08-31 16:26:29 @taylor_dallas Twain popularized it. He claims in the original text that he got it from Disraeli but no evidence has been found that Disraeli actually said it.

2022-08-31 16:11:51 Follow me for more tweets about data science, statistics and their impact on society.

2022-08-31 16:11:50 Twain published the quote about "lies, damn lies and statistics" in 1907. Modern statistics didn’t exist yet. By "statistics", he meant the simple tallies and surveys that governments and businesses have been conducting forever like “9 out of 10 dentists approve of Orbit gum!”

2022-08-31 16:11:49 Most people think they know what this quote means but they're dead wrong. Read this short thread to find out why: https://t.co/jjEWjjYWLp

2022-08-31 14:39:28 @HRJ21 To do that, you basically need to know upfront, before you start, exactly how many hours a project will require and exactly how much value you're delivering to the client. In other words, you'd have to be pretty experienced and probably don't need this thread.

2022-08-31 14:34:09 RT @kareem_carr: Deciding how much to charge for consulting work can be a nightmare. Here are SIX factors I use to help me figure out my h…

2022-08-31 11:04:28 @ryxcommar @PhDemetri Despite your best efforts, you did write something that would be quite informative for people with close to zero knowledge on this topic.

2022-08-31 10:14:42 @timnitGebru @Grady_Booch I’ve found that if you have a decent following and you tweet about the negative impact of tech on minority populations, there’s a good chance Lex has blocked you. I’m blocked as well despite never having interacted with him.

2022-08-30 15:46:13 @zergbane Agreed. The amount of demand for your services is a good indicator of whether you're going too high. But in my experience, charging too little is a much more common problem than charging too much.

2022-08-30 15:42:11 @LauraALibby I guess I didn't say this explicitly in the thread but the idea is to charge different rates based on the task. Estimating from your current pay would only give you a baseline rate.

2022-08-30 15:28:32 @Jimi_Smash I would do free consulting work for groups in cases where I would also give money. So if I’d give money to a cancer charity, I’d also consult for free. If I wouldn’t say yes to a DM asking me for $200, I also wouldn’t say yes to a request for $200 worth of free consulting.

2022-08-30 14:34:15 @gib_hurst Yes. I worked for ~3 years in a team that did statistics and data science consulting.

2022-08-30 14:26:13 Consulting rates: $25/hr: they get what they get. $50/hr: just starting out

2022-08-30 14:26:12 LEARNING TIME: When you're consulting, your time is the product that people are buying. Charging for learning time is just being honest about what it would take for *you* personally to be part of doing the work they want done.

2022-08-28 18:20:44 @rayjaymay1967 He said that in the late 1800s. Modern statistics didn’t exist yet. He actually meant simple crude rates that governments and businesses have been collecting and sharing forever like “9 out of 10 dentists approve of Orbit gum!”

2022-08-28 16:46:48 MYTH: Statistics assumes everything is normally distributed. Statisticians typically translate your data into mathematical constructs that are guaranteed to be normally distributed if certain basic conditions are met. The normality is a consequence of the mathematics.
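
The sample mean is one example of such a construct: under the central limit theorem it is approximately normal even when the raw data are not. The sketch below is an illustration under stated assumptions, not the method from the thread. It draws strongly skewed exponential data, standardizes the sample means, and checks how many fall within ±1.96, which should be about 95% if they behave like a standard normal.

```python
import random
import statistics


def standardized_sample_means(n_per_sample=50, n_samples=5000, seed=0):
    """Standardized means of repeated samples from a skewed distribution."""
    rng = random.Random(seed)
    # Exponential(1) data: mean 1 and standard deviation 1, far from normal.
    standard_error = 1 / n_per_sample ** 0.5
    z_scores = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n_per_sample)]
        z_scores.append((statistics.fmean(sample) - 1.0) / standard_error)
    return z_scores


z = standardized_sample_means()
share_within = sum(abs(v) <= 1.96 for v in z) / len(z)
print(f"Share of sample means within ±1.96 standard errors: {share_within:.3f}")
```

The raw exponential values are nowhere near bell-shaped, yet the share printed here should sit close to 0.95.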

2022-08-28 16:46:47 MYTH: Statistics is objective. Statistics helps us reason objectively about our subjective experience. While the process is mathematical, which is as objective as human reasoning gets, due to the subjectivity of the inputs, the outputs are irreducibly subjective as well.

2022-08-28 16:46:46 Statistics is a frequently misunderstood field. Yet billion dollar decisions often ride on getting it right. THREAD: 6 myths about statistics explained in plain English:

2022-08-28 10:03:11 RT @kareem_carr: Four reasons why I think authorship on academic papers should be more like movie credits: https://t.co/a0mjiuqgAy

2022-08-27 15:01:43 @camjpatrick I agree. I was addressing the “real math” part of the statement. Mathematical statistics is by no means all there is to statistics.

2022-08-27 14:36:34 Most math papers are basically "here's some cool math I found" but statisticians will write about math even when it's not cool, because it seems needed for science, which is why some mathematicians don't vibe with us. https://t.co/g77vBdRYPl

2022-08-26 17:53:33 @dziyang OK. But how do you relate trait neuroticism to data analysis?

2022-08-26 17:46:19 @dmi3k I still use p-values regularly. Lots of statisticians do.

2022-08-26 10:36:19 RT @kareem_carr: Four reasons why I think authorship on academic papers should be more like movie credits: https://t.co/a0mjiuqgAy

2022-08-25 22:49:30 @SJB_SynBio @drdevangm I do want to apologize for not adding something like “source: Steven Burgess” to the illustration. It’s good work that really fleshes out the concept for people.

2022-08-25 22:43:49 @SJB_SynBio @drdevangm Thanks. I’ve been thinking about it for at least 3 years. My recollection is I came up with it on my own but honestly I don’t trust myself not to have read it somewhere and forgotten. Google suggests the idea has been around since at least 2007: https://t.co/oDnGdLQmSJ

2022-08-25 22:29:58 @DavidRSt3wart @JakubTomek13 @drdevangm @SJB_SynBio Yeah. I’m not sure where the conflict is here. I’m definitely not claiming credit. I neither need nor want it.

2022-08-25 22:28:20 @JakubTomek13 @DavidRSt3wart @drdevangm @SJB_SynBio To be clear, I wrote the thread from scratch. It’s not based on other resources. The illustration is the only part that I didn’t do myself. (This does not mean I am claiming credit for the idea. I don’t know where I got it from. It’s at least as old as 2007.) https://t.co/kvxDB33LhS

2022-08-25 18:14:37 @herzi38 How do you use it in practice? Do most journals you submit to have a form? Do you put in your paper or supplementary materials?

2022-08-25 17:32:48 @drdevangm @SJB_SynBio An ironic but fair point. Thanks for sharing. He is indeed the source.

2022-08-25 16:41:33 @jmgduarte @UCSanDiego This is amazing! Thanks for sharing.

2022-08-25 15:36:03 Follow me if you're curious about the systems we use to create knowledge.

2022-08-25 15:36:02 Reason 3: It would be fairer. Currently what merits authorship is the subjective opinion of the person or persons in the collaboration with the most power. This kind of setup is easily abused.

2022-08-25 15:36:01 Reason 2: It would surface many of the invisible contributors to science. It would give a concrete way to give credit to people who currently get lumped into the "acknowledgements" under the current system, or don't get listed at all, and who are effectively invisible.

2022-08-25 15:36:00 Let me clarify what I mean. What I like about movie credits is that all the ways of contributing to the finished product are standardized, and everybody gets acknowledged explicitly for whatever they did no matter how small. This is missing from academia!

2022-08-25 15:35:59 Four reasons why I think authorship on academic papers should be more like movie credits: https://t.co/a0mjiuqgAy

2022-08-23 16:24:53 RT @EpiEllie: Feeling frustrated hearing “the pandemic is over” while you &

2022-08-20 11:27:05 RT @kareem_carr: Finally tried out the GPT-3 model from OpenAI. The green text is unaltered output generated by the AI. https://t.co/NxveJh…

2022-08-19 14:09:02 @sarahradz_ Maybe you should have let GPT-3 try???

2022-08-19 14:01:37 Not a joke. It’s pretty smart!

2022-08-19 14:01:36 Finally tried out the GPT-3 model from OpenAI. The green text is unaltered output generated by the AI. https://t.co/NxveJhduGn

2022-08-18 15:03:52 What's your best tip for a specific book to read, specific video series to watch, specific course to register for, specific tutorial to follow or some other specific step that someone starting with little to no knowledge can take to get started with data science today?

2022-08-18 08:54:50 RT @kareem_carr: girl, i just need a moment https://t.co/4tFQWJ65tR

2022-08-17 13:46:03 RT @kareem_carr: girl, i just need a moment https://t.co/4tFQWJ65tR

2022-08-17 00:23:55 girl, i just need a moment https://t.co/4tFQWJ65tR

2022-08-16 19:40:45 @filippie509 I’ve been thinking about a way to express this concern as a Twitter thread for a while now without success. Great thread! I disagree that statistics isn’t equipped to address this issue, but I agree that machine learning as currently practiced doesn’t seem to be.

2022-08-16 15:24:02 @ShirleyBWang Congrats!

2022-08-16 08:16:34 @timeseriesdave Sure. Can you give a quick sketch of what gives something “moneyness”?

2022-08-16 08:12:49 RT @kareem_carr: I have a stupid economics question. Imagine a matrix where the ij-th element is the amount of object i that can be exchan…

2022-08-15 20:21:54 @arvizzoni I'm probably the one that's completely off. More just curious to hear how.

2022-08-15 16:02:54 I would suspect that we'd have to somehow weight appropriately for the frequency at which certain exchanges happen since they aren't all equally likely.

2022-08-15 16:00:47 I have a stupid economics question. Imagine a matrix where the ij-th element is the amount of object i that can be exchanged for object j. Can we think of "money" as something like the first principal component of this matrix?
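
A toy sketch of the question above, under a strong simplifying assumption that is not in the thread: every good has a single consistent price, so the exchange matrix has entries R[i][j] = price_j / price_i and is rank one. In that idealized case the leading right singular vector of the matrix is proportional to the price vector. The code uses power iteration on the uncentered matrix, which is a close cousin of a first principal component rather than textbook PCA, and the prices are made up.

```python
def exchange_matrix(prices):
    """R[i][j] = units of good i that trade for one unit of good j."""
    return [[p_j / p_i for p_j in prices] for p_i in prices]


def leading_right_singular_vector(matrix, iterations=100):
    """Power iteration on M^T M, in plain Python for transparency."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]  # w = M v
        v = [sum(matrix[i][j] * w[i] for i in range(n_rows)) for j in range(n_cols)]  # v = M^T w
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v


prices = [1.0, 2.5, 10.0]  # hypothetical prices for three goods
R = exchange_matrix(prices)
v = leading_right_singular_vector(R)
relative_prices = [x / v[0] for x in v]  # expressed in units of the first good
print(relative_prices)
```

Real exchange data would not be perfectly consistent, and, as the follow-up tweet suggests, some weighting by how often each exchange actually happens would probably be needed.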

2022-08-15 13:41:42 solidarity https://t.co/8gOHBgpZYp

2022-08-13 16:29:01 the best of both worlds https://t.co/BezC5nHmDr

2022-08-13 15:10:14 What did the British data scientist say when he wanted one more analysis? “Oi mate, can i have ANOVA?”

2022-08-13 09:09:38 @johnquackenbush The trait could be "Being Professor John Quackenbush". Value 1 if you are and 0 if you're not. (More seriously, perhaps some unique de novo mutation with an associated novel phenotype.)

2022-08-13 01:23:28 RT @kareem_carr: Math Question: For any given trait, do half of all people have to be below average? If not, what's the largest percent of…

2022-08-12 15:13:39 (By “average”, I mean the standard definition: the mean value.)

2022-08-12 15:02:19 Math Question: For any given trait, do half of all people have to be below average? If not, what's the largest percent of a population that could theoretically be above the average?
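
A small sketch of one way to explore the question, using the mean as the average. The example population is made up: one person scores 0 and everyone else scores 1, which drags the mean just below 1 and leaves all but one person above it.

```python
def share_above_mean(values):
    """Fraction of values strictly greater than the mean."""
    mean = sum(values) / len(values)
    return sum(v > mean for v in values) / len(values)


# One low value and 99 identical higher values: the mean is 0.99.
lopsided = [0] + [1] * 99
print(share_above_mean(lopsided))   # 99 of 100 people are above average

# If everyone is identical, nobody is strictly above the mean.
print(share_above_mean([5] * 10))
```

Making the population larger pushes the share as close to 100% as you like, but it can never reach 100%, because at least one value has to sit at or below the mean.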

2022-08-12 14:17:23 I tried to impress my calc prof by demonstrating that a constant function’s mean was also its maximum. She said my proof was average at best.

2022-08-12 10:02:43 RT @kareem_carr: I once met a french statistician who made a late career switch to logistics management for a regional seafood wholesaler.…

2022-08-11 14:16:18 I once met a French statistician who made a late career switch to logistics management for a regional seafood wholesaler. She said she was really into the Poisson Distribution.

2022-08-11 10:10:35 @daniela_witten @aristeinberg @COPSSNews @AmstatNews Amazing. Congrats!

2022-08-10 17:06:48 RT @EpiEllie: It's almost Back to School time &

2022-08-10 16:36:13 getting into university now vs when your professors applied to university https://t.co/L6lPkbJnvE

2022-08-10 14:42:06 Update: the account that was spreading racist misinformation about me has been suspended. I didn’t report him but to those that did, thanks for your support. https://t.co/okLPEibPNt

2022-08-10 12:15:23 @statsepi Thanks for saying that, Darren. I really appreciate it.

2022-08-10 12:13:04 @ChelseaParlett Thanks, Chelsea!

2022-08-10 10:14:13 RT @kareem_carr: Feeling a bit bummed out by all the racism I have to deal with on this site. It feels so excessive given I mostly tweet a…

2022-08-09 15:12:10 I should probably ignore it all myself but sometimes it sucks to have to deal with these racially-motivated smears against my competence and character.

2022-08-09 15:12:09 I know some of you will want to go over there and give him a piece of your mind. I don’t advise it. It will only cause the Twitter algorithm to think his account is getting lots of engagement and promote it.

2022-08-09 15:12:08 Take the already tiny odds of getting into Harvard grad school and make it even smaller. That was my situation. What I find most disappointing is out of the dozens of comments and hundreds of likes, I saw only one or two people object to the blatant racism which is sad. https://t.co/t1ZLDaWvjM

2022-08-09 15:12:07 Given that my university has its pick of the best students on the planet, you’d have to think there are no smart black people anywhere in the world for them to have gone with a dude that can’t even count.

2022-08-09 15:12:06 Feeling a bit bummed out by all the racism I have to deal with on this site. It feels so excessive given I mostly tweet about math and statistics not politics. So many people on here can’t see past my skin color. They make up wild statements about me and question my humanity. https://t.co/7VxzWv479P

2022-08-09 14:20:09 https://t.co/I8rGgLKb8q

2022-08-09 14:14:25 RT @EpiEllie: It’s been about a year since I was diagnosed with ADHD &

2022-08-09 13:50:30 @cainwill @blueyeliz @wirelessben Varies but extremely common. Mostly standard at my institution.

2022-08-09 12:56:00 @HenningStrandin Either "2" is about real stuff that exists or it's not and we can go back to equating it to Harry Potter.

2022-08-09 12:55:27 @HenningStrandin So you're admitting to taking your intuitions about math from physical reality, yes? In that case, must you not also agree that empirical facts about what happens when you aggregate two things with two more things ought to be relevant to the notion of "2+2"?

2022-07-27 11:02:18 @HenningStrandin I think they are fully aware that I'm using "+" as a model for a physical process in all my examples. Something they probably do themselves on a daily basis without thinking about it. And the process of discussing it actually makes it clear why that might not be a good idea.

2022-07-27 11:01:02 @HenningStrandin I don't think it's reasonable to use "+" to mean what it might mean to a mathematical formalist when speaking to a general audience. It's basically equivocating since they wouldn't be aware of what that means and will translate it into the idea of "combining".

2022-07-27 10:53:07 @HenningStrandin I also agree that linguistic clarity resolves the problem, but it's noteworthy (to a general audience) that the price of that linguistic clarity is that we are no longer talking about physical reality (which is the motivation for the thread).

2022-07-27 10:41:14 @HenningStrandin I think we're saying the same thing, which is that "+" in arithmetic is not the same as physically combining groups of things. Believe it or not, this is news to a lot of people who aren't philosophers or mathematicians. Hence the thread.

2022-07-27 10:27:05 @HenningStrandin I get it too, I think. You don't think about math as needing to accurately describe the physical world. Most regular people do though and "combining" is absolutely how they think of addition which is why the thread resonates.

2022-07-27 09:24:41 RT @kareem_carr: In the summer of 2020, I got into a huge internet fight about math. It was such a big controversy that I ended up being p…

2022-07-26 23:39:05 @vinodkhare Numbers arise from record keeping and records were an attempt to keep track of physical things (grain, livestock etc). If you mean that a debt isn’t physical then I guess that’s fair. Although, a debt is related to a physical exchange of goods historically speaking.

2022-07-26 22:23:48 @KaraSunburst People get a little mad at me too.

2022-07-26 22:01:42 RT @DiscoDeerDiary: I fucking love this thread. Provides a comfortable counterpoint to having to hear so many people smugly say "I guess it…

2022-07-26 21:05:20 @llaWttaM I think what you're saying is mathematics as a formal logical system is a way of coming to know real truths about the world. To that, I would say yes, I agree. In the thread, I'm haggling over what kind of truth math is, but I do agree that it's a kind of truth.

2022-07-26 20:15:27 RT @KayedSabrina: This thread is

2022-07-26 16:46:40 RT @think___y: This is the content worth having Twitter account for

2022-07-26 16:24:47 @RhodicMike What "side" am I on?

2022-07-26 16:22:52 RT @jmagnuss: Ah some good brain warmup for Tuesday morning!

2022-07-26 16:22:26 RT @MonicaBaumann: As always, Kareem is a great communicator.

2022-07-26 15:18:51 So, in my opinion, outside of a math classroom, "2+2=4" is a meaningless statement until you tell me what we're actually adding.

2022-07-26 15:18:50 Even a very simple conceptual framework like arithmetic can turn out to be a lot more complicated than most of us realize.

2022-07-26 15:18:49 Let's say my goal wasn't to measure the number of people with covid but instead to measure the level of hospital utilization. Now, the same patient visiting two different hospitals should be counted as two separate utilizations of the available healthcare resources.
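
A minimal sketch of the counting point in this thread, using made-up visit records (the patient and hospital names are hypothetical, not from the thread). The same rows give different totals depending on whether the question is "how many people" or "how many utilizations":

```python
# Hypothetical visit records: (patient_id, hospital). Illustrative data only.
visits = [
    ("patient_a", "hospital_1"),
    ("patient_a", "hospital_2"),  # same patient, seen at a second hospital
    ("patient_b", "hospital_1"),
    ("patient_c", "hospital_2"),
]

# Question 1: how many people have covid? Each patient counts once.
covid_cases = len({patient for patient, _ in visits})

# Question 2: how much hospital capacity was used? Every visit counts.
hospital_utilizations = len(visits)

print(covid_cases, hospital_utilizations)
```

Here the first total is 3 and the second is 4, so "what are we adding?" has to be answered before the arithmetic means anything.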

2022-07-21 13:57:24 sounds like a fun friday night me https://t.co/Ow4jBGWDkm

2022-07-20 06:11:09 RT @kareem_carr: I believe that data illiteracy, the inability to make sense of data, is becoming a huge barrier to human progress. The co…

2022-07-19 23:02:42 *me, reflecting on my dissertation years later* interviewer: “was any of it true?” https://t.co/yL6uKOKx09

2022-07-19 22:54:25 being a grad student, i know this feeling well.

2022-07-19 22:52:57 tfw you’re smart yet stupid https://t.co/XPSYfgLBhP

2022-07-19 16:29:37 @rowyourbot It would be harder to lie to people about what’s in the data if those people were data literate similar to how it’s harder to lie to people about what’s in a book if they can read.

2022-07-19 14:49:36 Advances in tech are making data collection faster and cheaper than ever before. The need to be data savvy is only going to get greater.

2022-07-19 14:49:35 I believe that data illiteracy, the inability to make sense of data, is becoming a huge barrier to human progress. The covid pandemic has been a perfect example of this.

2022-07-19 01:57:22 RT @kareem_carr: I’m sorry but this argument from Elon’s lawyers is borderline statistically illiterate. It doesn’t matter at all that 100…

2022-07-18 22:57:28 Don’t do it! https://t.co/k2I2SIktRP

2022-07-18 15:59:47 science is a harsh mistress https://t.co/IEFf9Qe7m6

2022-07-18 14:18:06 Some questions for people who believe personhood begins at conception: 1. If a zygote splits to form twins, did the original person die or are they both the same person? 2. If two zygotes fuse to form a chimera, is the new human being two people or did the original two die? https://t.co/VYl6EDHxQz

2022-07-18 13:19:49 A few commenters have brought up that technically N does matter (which is true). The formula pictured below is a little closer to the truth. Assuming Twitter has about 229 million active daily users, this formula is about 99.98% the same as the one I used earlier in the thread. https://t.co/sbN7hWybyI
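
The pictured formula is not reproduced in this text, so the sketch below assumes it is the standard finite population correction, sqrt((N - n) / (N - 1)), applied to the margin of error from earlier in the thread. N = 229 million and p = 0.05 come from the tweets; using n = 9000 here is also an assumption.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

def finite_population_correction(n, N):
    """Standard correction factor (assumed here, not taken from the image)."""
    return math.sqrt((N - n) / (N - 1))

N = 229_000_000   # active daily users, as stated in the tweet
n = 9000          # quarterly sample size from the thread (assumed to apply here)
p_hat = 0.05      # Twitter's bot estimate from the thread

fpc = finite_population_correction(n, N)
uncorrected = margin_of_error(p_hat, n)
corrected = uncorrected * fpc

print(f"correction factor: {fpc:.6f}")
print(f"uncorrected: {uncorrected:.4%}, corrected: {corrected:.4%}")
```

Under these assumptions the correction factor is extremely close to 1, which is the sense in which the population size barely matters.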

2022-07-18 13:11:00 @awwscript Fair point. Updated my thread.

2022-07-18 12:44:00 RT @kareem_carr: I’m sorry but this argument from Elon’s lawyers is borderline statistically illiterate. It doesn’t matter at all that 100…

2022-07-17 20:27:07 @steveniweiss I didn’t know this when I wrote the thread but looks like Twitter was thinking about it as 9000 accounts per quarter as well. https://t.co/IWzeYnqEIq

2022-07-17 20:14:40 @TennantRob If you already have ±0.5% accuracy, what would be the logic behind going for more?

2022-07-17 19:15:46 RT @OliPerkins2: Nice thread here on sample size, elon musk and general bs.

2022-07-17 18:01:12 It’s strange that Elon brought up machine learning as a superior option. Obviously the training data for the machine learning algorithms would have to come from humans so human judgement is still the gold standard here.

2022-07-17 18:01:11 We just have to plug in our estimate which for Twitter was p=0.05 or 5%. Plugging that in, we get about ±0.5%. So we can be 95% confident that the true proportion of bots is between 4.5% and 5.5% which is accurate enough for any business relevant decisions.
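
A minimal sketch of the plug-in calculation described in this thread, using the formula 1.96 × sqrt(p(1-p)/n) with p = 0.05. The function and variable names are illustrative, not from the thread.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Note that the total population size does not appear anywhere.
    """
    return z * math.sqrt(p * (1 - p) / n)

p_hat = 0.05  # Twitter's estimated proportion of bots

daily_moe = margin_of_error(p_hat, n=100)       # one day's sample
quarterly_moe = margin_of_error(p_hat, n=9000)  # roughly a quarter at 100 a day

print(f"n=100:  ±{daily_moe:.2%}")
print(f"n=9000: ±{quarterly_moe:.2%}")
print(f"95% CI at n=9000: {p_hat - quarterly_moe:.2%} to {p_hat + quarterly_moe:.2%}")
```

With n = 9000 the margin comes out near ±0.45%, which rounds to the ±0.5% quoted above; with a single day's 100 accounts it is several percentage points.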

2022-07-17 18:01:10 You may have noticed the error formula is a bit circular still since it includes p but p is what we’re trying to figure out in the first place. Not a problem.

2022-07-17 18:01:09 The percent of bots on any single day is probably not a business relevant number anyway so the relevant sample size isn’t 100. We probably want the percent of bots over a longer timescale like the last business quarter. At 100 a day, that gives you around 9000 for a quarter.

2022-07-17 18:01:08 The error bound (95% confidence interval) in this situation is approximately 1.96 times the square root of p(1-p)/n where p is the percentage of bots and n is the number of samples. Notice the formula doesn’t involve the total population size at all. https://t.co/cVXIRS0bIP

2022-07-17 18:01:07 I’m sorry but this argument from Elon’s lawyers is borderline statistically illiterate. It doesn’t matter at all that 100 accounts is a small percent of the total user base. https://t.co/cBzexqAPWE

2022-07-17 00:34:09 the best ex is eˣ

2022-07-15 22:26:27 @Piper_O_Brien @mattyglesias Not every joke is about politics. I'm a biostatistics PhD student which is literally the type of statistician that works on clinical trials.

2022-07-15 16:09:14 this is a tough one https://t.co/VnPmjJAxBY

2022-07-15 08:41:59 @evoluminate @SC_Griffith I’m aware. https://t.co/dAbtgoLS2t

2022-07-14 22:22:00 he’s a 10 but the confidence interval is (0,10)

2022-07-14 20:56:12 @mihaisafta_ @SC_Griffith Yes, you can imagine something going wrong but you can also imagine it *not* going wrong. I think that matters. I’m not saying everything imaginable is going to make sense but done consistently (on the level of a physics-like thought experiment) I think you can get sense from it.

2022-07-14 20:41:45 @sleepmancer @kevinlowens @SC_Griffith As long as the rules of your imaginary world were consistent enough, I'm sure those could generalize to other mathematical systems as well. Lol.

2022-07-14 20:38:37 @TheOutsiderHum1 @SC_Griffith I tend to think of such axioms as attempts to get abstractions that capture certain physical intuitions, so I just thought I'd go straight to the source.

2022-07-14 20:33:57 @Yourdadsfriend3 @isbellHFh @SC_Griffith Just imagine a huge pile. I don't think it matters that beyond a certain number of apples, it's all going to look the same. That's actually a plus. Makes it even more plausible to envision!

2022-07-14 20:32:33 @kevinlowens @SC_Griffith By "add", I just mean you walk over and get another apple and then throw it on or near the pile. If there's a situation that comes up that blocks you from doing this, like your bin is full, you can always imagine a counterfactual situation where the bin has room for one more.

2022-07-14 20:16:40 @SC_Griffith (Not saying this because I think I can explain things to *you* by the way given your background. Hoping you will clarify things to *me*. )

2022-07-14 20:12:21 @SC_Griffith Imagine having a pile of n apples. Now imagine going and getting one more apple and adding it to your pile. Now you have a new number of apples which we will call "n+1". It is always possible to imagine getting one more apple and so it's plausible that no largest number exists.

2022-07-14 16:05:00 I think what most people mean by "neurotypical" doesn't really make sense in the sciences. I've noticed for instance that a lot of people who're neurodivergent outside of science are very neurotypical inside of science.

2022-07-14 15:41:09 If lots of packages tend to look like but don’t have a inside, then the probability will be high so no reason to suspect there is any to be had. But if it’s very rare for packages to look like that then we know we probably got a real on our hands.

2022-07-14 15:41:08 Lots of folks have trouble remembering what a p-value is so let me explain with the help of a visual aid. The p-value is the probability that this package would be as -shaped (or even more so) if it did *not* contain the thing that we’re all thinking is in there. https://t.co/2gvLKv5Vna
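
The same definition can be sketched with a coin instead of a package. The example is hypothetical: 9 heads in 10 flips, with "the coin is fair" as the null hypothesis. The p-value is the probability of a result at least this extreme if the null were true.

```python
from math import comb

def p_value_at_least(k_observed, n_flips, p_null=0.5):
    """One-sided p-value: P(X >= k_observed) for X ~ Binomial(n_flips, p_null)."""
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(k_observed, n_flips + 1)
    )

# Hypothetical observation: 9 heads out of 10 flips of a supposedly fair coin.
p_val = p_value_at_least(9, 10)
print(round(p_val, 4))
```

The result is 11/1024, about 0.011: fair coins rarely look this lopsided, so the observation counts as evidence against fairness. If lopsided results were common under the null, the p-value would be high and there would be no reason for suspicion.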

2022-07-14 15:14:50 Thanks for the great responses philosophy Twitter. It was very helpful. https://t.co/50YsLpoXWi

2022-07-14 09:03:12 RT @kareem_carr: One of my goals is to incorporate more philosophy into how I think about statistics. In general, I think scientists *sho…

2022-07-13 21:53:39 Speaking as a non-American, I think America has an excellent healthcare *market* where you can *buy* a lot of cutting edge services not available elsewhere. You can even buy your way to the head of the line which is harder to do in say Canada. https://t.co/g55r8rrNdp

2022-07-13 21:40:04 Something very interesting must have happened in the 1980s. Unclear what. https://t.co/tlOaADkx8B

2022-07-13 21:40:02 The difference is so clear-cut, it’s almost like a law of physics. https://t.co/I9hrHpgBlH

2022-07-13 17:50:38 RT @kareem_carr: One of my goals is to incorporate more philosophy into how I think about statistics. In general, I think scientists *sho…

2022-07-13 09:33:42 @taka_tukka Kind of you to say. Thanks!

2022-07-13 03:12:10 @JoshHochschild Makes a lot of sense. Resembles what I would do when I did statistics consulting.

2022-07-13 02:59:44 @dekkaaah Thanks!

2022-07-13 02:59:26 @GLopezPharmD Thanks!

2022-07-13 02:33:27 RT @kareem_carr: One of my goals is to incorporate more philosophy into how I think about statistics. In general, I think scientists *sho…

2022-07-13 00:54:30 @SaraLUckelman @venite I agree that it’s hard to come up with advice that works for everybody, but I’d definitely answer that question with “google ‘hammer wirecutter’ and see what they recommend”

2022-07-13 00:44:56 @d_malinsky good suggestion. already working through it. https://t.co/uGVHiGbyg1

2022-07-13 00:14:21 @mpsterling I’ve definitely tried articles from the SEP multiple times. It often feels like I’m drinking from a firehose. It’s helpful in the sense that you get a lot of perspectives though.

2022-07-13 00:09:34 @michelnivard A statistician should definitely be able to tell you what the consensus approach to a particular variety of data analysis is. They might need specific details about the data to be sure it applies though.

2022-07-13 00:00:29 @taka_tukka That actually seems a very reasonable question to ask a statistician. I’ve answered that question many times. I’ve even answered it here on Twitter: https://t.co/OrD7gBqFar

2022-07-12 23:55:11 RT @ridderjeroen: Philosofriends, discuss this thread by @kareem_carr – it’s a striking observation about interdisciplinary conversations w…

2022-07-12 23:55:08 @ridderjeroen Thanks for the boost and the responses. They were very helpful!

2022-07-12 16:49:24 I want to know how to have *that* kind of conversation with philosophers. How do I do that?

2022-07-12 16:49:23 Imagine asking "What's a good hammer?" and getting: 1. a detailed explanation of Aristotle's favorite hammer from 2000 years ago 2. A copy of "The Encyclopedia of Hammers" 3. instructions on how to handcraft a hammer completely from recycled coconut leaves and conch shells.

2022-07-12 16:49:22 I think how I ideally would want a conversation with a philosopher to go is I ask something like "what is truth?" and they give me a reasonable consensus view with a description of downsides if any and then I turn it into math. It doesn't have to be perfect, just good enough.

2022-07-12 16:49:21 One of my goals is to incorporate more philosophy into how I think about statistics. In general, I think scientists *should* work more with philosophers. But I find talking to philosophers can be frustrating so I wanted to outline where I think the frustration comes from.

2022-07-12 14:10:34 people who write their lecture notes in latex https://t.co/PaptY8F1np

2022-07-11 14:27:56 when you love your data but it doesn’t love you back https://t.co/9UoOQxx3mR

2022-07-11 00:48:52 Seems like there might possibly be a similar thing going on in universities as well. I wonder if there's a link between cost inflation and administrator inflation. What are all these administrators administrating? https://t.co/KjRyF1pbdt https://t.co/KYVBgAc4xk

2022-07-10 18:23:05 lost a hypothesis today. data didn’t support it. need to shake it off. got 5 more hours of data cleaning to do. https://t.co/OLukXvHLR1

2022-07-10 16:29:17 @jmccollum27

2022-07-10 16:29:10 @kurosakiaduma Thanks!

2022-07-10 15:33:10 Hey new followers, let me introduce myself. I'm your friendly neighborhood statistician. My viral tweets tend to be me being ridiculous, but I sometimes sneak in a few serious points about how I think statistics can help improve our increasingly data-drenched society.

2022-07-10 04:51:14 @adad8m @JARS3N You should look into tidyverse if you haven't already. They are very much redesigning the syntax of the language.

2022-07-09 23:51:47 I think many non-statisticians are unaware of the most significant difference between R and Python for statistics. I would guess that stats code in R has benefited from millions of hours of attention from thousands of statisticians. Community investment like that is hard to beat.

2022-07-09 21:20:15 @patscli https://t.co/roapbqBJWv

2022-07-09 20:16:48 when they ask you why you made that particular choice for your analysis and you don’t even remember doing that https://t.co/uVLjZgzTdF

2022-07-09 19:21:52 @gyp_casino ¯\_(ツ)_/¯ https://t.co/KYDdK6dYrU

2022-07-09 19:19:20 @kneupane I think it's inevitable that Python would win out in the machine learning world. So much of ML is about efficiently grinding through prototypes which is easier to do in a language like Python that has a greater emphasis on the software engineering and DevOps side.

2022-07-09 19:15:54 And before the real R heads chime in about software engineering in R, if you've literally worked for Rstudio then that's cheating.

2022-07-09 19:11:45 When somebody says R is better than Python for data science or vice versa, most of the time, I think that's just an indication of what they spend most of their time on as a data scientist: statistical analysis vs software engineering.

2022-07-09 19:10:29 I don't think either R or Python are clearly best for data science. For the kind of data scientist that's more like a statistician, R is best. If they're more like a software engineer, Python is probably the way to go.

2022-07-08 23:35:51 learning statistics be like https://t.co/sak6uJ17bh

2022-07-08 15:00:46 You’ve heard of position, velocity and acceleration. What about jerk, snap, crackle and pop? https://t.co/GVGL15JPby https://t.co/TyduVVQQwA

2022-07-08 14:23:43 when you’re feelin fancy https://t.co/Nar8ZHtqEb

2022-07-08 01:40:32 me, a grad student, returning from a successful hunt at the weekly seminar https://t.co/H88cHtHQpz

2022-07-08 00:02:41 when you contribute to the paper but don’t get authorship https://t.co/qlfW3WbfXv

2022-07-07 22:17:36 RT @kareem_carr: Nobody will remember: - Your salary - Your fancy title - How ‘busy’ you were - How stressed you were - How many hours you…

2022-07-07 15:52:35 The NYT goes deep down the statistics rabbit hole and I’m here for it https://t.co/0jJf5N6GYj https://t.co/POMg3XtaHg

2022-07-07 15:37:25 @PhDemetri Google forms? You can set it to force people to input certain fields, and it should address formatting issues in general.

2022-07-07 15:02:00 I don't think it makes sense to call certain kinds of math "easy". You can find arbitrarily hard math problems at every level. Math is often taught as a curated path of solved problems which gives people the mistaken impression that all the problems at prior levels are solved.

2022-07-07 14:12:00 Nobody will remember: - Your salary - Your fancy title - How ‘busy’ you were - How stressed you were - How many hours you worked. People will remember: - that you commented your code

2022-07-07 11:11:57 @Undercoverhist Fascinating. Is this also the invention of the word "programming" as in computer programming as well?

2022-07-07 02:06:06 This kind of behavior makes me furious. It’s so shortsighted! When authorities misuse the genetic data that they’ve been entrusted with, it makes people afraid to share their genetic info which slows down research and hurts all of us. https://t.co/qdID8zyLws

2022-07-07 01:43:49 getting it done with a little help from reviewer #2 https://t.co/xHt0umXToM

2022-07-06 21:07:09 @Aella_Girl I think the true value is about 0.16 for all groups with the smaller categories showing more noise (extreme values) due to smaller sample sizes. You can try simulating this in Python to get a sense. You should consider a side-career in data science. I think you’d enjoy it.
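
One way the suggested simulation might look, assuming every group shares the same true rate of 0.16 and differs only in sample size. The group sizes (20 and 2000) and the number of repeats are made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_RATE = 0.16  # the common value suggested in the tweet

def observed_rate(n):
    """Simulate n yes/no responses and return the observed proportion of 'yes'."""
    return sum(random.random() < TRUE_RATE for _ in range(n)) / n

def spread_of_estimates(n, repeats=500):
    """Standard deviation of the observed rate across repeated surveys of size n."""
    return statistics.pstdev([observed_rate(n) for _ in range(repeats)])

# Hypothetical group sizes: a small category and a large one.
small_group_spread = spread_of_estimates(20)
large_group_spread = spread_of_estimates(2000)

print(f"n=20:   spread {small_group_spread:.3f}")
print(f"n=2000: spread {large_group_spread:.3f}")
```

Both groups have the identical true rate, yet the small group's estimates scatter far more widely, so extreme values in small categories are not by themselves evidence of a real difference.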

2022-07-06 20:58:46 @Aella_Girl I would rescale: 0 for no interest and 1 for anything more. Using your own scale is a bit arbitrary and makes it harder to interpret the average. The evidence here suggests to me these numbers don’t differ at all by race.

2022-07-06 20:24:40 Thanks for the vote of confidence @gib_hurst https://t.co/7zCENgtPq4

2022-07-06 15:15:17 when your model fails to generalize to real world data https://t.co/aoJE9ZazqB

2022-07-06 14:15:13 Nobody will remember: - Your salary - Your fancy title - How ‘busy’ you were - How stressed you were - How many hours you worked. People will remember: - That one time you wrote a bunch of code and it ran with zero errors on the first try.

2022-07-06 00:00:08 Statisticians *inventing the technique*: Haha. This is foolproof! Researchers *applying the technique*: https://t.co/kDkqY7XRZr

2022-07-05 20:52:17 RT @EpiEllie: If you like statistics &

2022-07-05 20:08:43 @CGraziul I try. Thanks!!

2022-07-05 15:54:05 when a bayesian and a frequentist meet in person https://t.co/rXknfcEEAp

2022-07-05 14:01:16 woohoo 80k followers! i don't know why so many people are into statistics tweets but I appreciate you all. https://t.co/O4H5sKQJJq

2022-07-04 20:31:27 RT @kareem_carr: her: babe what you thinking about / me, a data scientist: https://t.co/Zs3T5wkJ8a

2022-07-04 19:30:37 @SMLaughna @EpiEllie True, but I think this gets at a deeper issue that’s not just about executing procedures correctly. https://t.co/YjLRZqXegq

2022-07-04 19:27:06 @MarcSchaefferGD I agree but the arithmetic above is correct, and adding percentages is fine *in the right context*. Statistics is where we learn what mathematical operations to use on our data and in which context.

2022-07-04 18:54:46 We NEED to make statistics a required course in high school. This is what happens when people think data analysis is just taking a bunch of numbers and doing whatever mathematical procedures feel right in the moment. Meaningless number crunching is rampant right now. https://t.co/6N6tFmG2Eo

2022-07-04 14:56:21 Showing my son all the open browser tabs he’ll inherit. https://t.co/enF2CyCrJG

2022-07-04 14:45:48 Are we nearing the peak of the current AI research bubble? https://t.co/KyQ7SGJnPr

2022-07-04 00:42:02 @alsoknownasLJ It’s interesting tech for sure with the potential to be radically transformative socially, but it also attracts massive amounts of wackiness, and it can be hard to tell which is which.

2022-07-04 00:30:57 Genome analysis on the *checks notes* block chain??? https://t.co/Tpa86qZhs2

2022-07-04 00:05:36 i would even settle for the mode tbh https://t.co/rcdIg8kcEB

2022-07-03 20:52:44 @Aella_Girl you forgot 5

2022-07-03 16:28:27 me, a data scientist, when i get a new dataset https://t.co/Qpdt3oMbdw

2022-07-03 15:06:07 her: babe what you thinking about / me, a data scientist: https://t.co/Zs3T5wkJ8a

2022-07-02 22:44:42 kind of the same but still hits different https://t.co/bFpfnHNhBw

2022-07-02 16:16:37 me, scheduling the 8am class: I’ll just drink plenty of coffee / me, taking the 8am class: https://t.co/UtAzFgx807

2022-07-02 15:37:56 when the difficulty level of the learning material is just right https://t.co/id2ftzTl5I

2022-07-02 14:41:48 BREAKING: The Supreme Court just ruled 6-3 that R programmers have the right to use <

2022-07-02 00:03:01 BREAKING: The Supreme Court just ruled 6-3 that the field of statistics has been found to have violated the free speech rights of researchers. In a stunning reversal of established precedent, ALL p-values now count as statistically significant.

2022-07-01 16:36:35 @ev_bjork I’m as concerned about this as you are. Donate to my campaign fund so we stop this together.

2022-07-01 16:26:29 BREAKING: The Supreme Court just ruled 6-3 that according to the US constitution logistic regression IS machine learning.

2022-07-01 15:09:46 not a lot of people know this, but machine learning is just statistics after you go into settings and enable dark mode https://t.co/lUEYs3NEjC

2022-06-30 22:55:17 that feeling when you’re the only statistician on the team and your collaborators just asked you, “wouldn’t it be easier if we just did a t-test?” https://t.co/UqxfQRl7UH

2022-06-30 20:01:06 @FerkoWithAFada https://t.co/w5v384uE0l

2022-06-30 14:33:00 Question: You train a machine learning model on a massive dataset. The test set performance is amazing. Can you think of situations where the model will still not perform well on real data? Extra credit: Why might it not perform well even on a randomly selected holdout dataset?

2022-06-29 22:59:20 ¯\_(ツ)_/¯ https://t.co/V9YXGhju3X

2022-06-29 22:35:54 pelosi really said “i need to speak to your manager” https://t.co/xs9Qkf5cJg

2022-06-29 16:42:04 saw boston university’s new data science building for the first time yesterday. if a name change is all it takes for statistics to get this kind of respect, i guess i’m down https://t.co/33Oztkbwmd

2022-06-29 15:44:17 I have a derivatives joke but I don't want to go off on a tangent.

2022-06-27 13:14:59 How many times have you had covid?

2022-06-24 09:34:13 RT @kareem_carr: “I don’t want to go to grad school, but I want more statistics in my life. How can I get this?” I often hear some versio…

2022-06-24 01:07:17 tuition costs, academic publishing, adjuncting, free work for authorship, the allure of tenure. pretty much all of academia. https://t.co/faOWTk0mmL

2022-06-23 15:27:39 still significant baaaaby!!! https://t.co/glgMeo5UYA

2022-06-23 14:59:47 I'd like to thank @AgnesCallard for giving me the idea of crowdsourcing an answer to this question.

2022-06-23 14:41:17 “I don’t want to go to grad school, but I want more statistics in my life. How can I get this?” I often hear some version of this question. Please help me come up with a GREAT answer to it.

2022-06-23 12:16:17 RT @kareem_carr: I’ve been thinking lately that the main difference between Frequentism and Bayesianism is Frequentists try to model the ev…

2022-06-23 04:44:27 @antipattern @ben11kehoe @lastpositivist Thanks. Will have to check that one out.

2022-06-22 15:27:38 this is basically me writing code every day https://t.co/1b5FiJd9Xq

2022-06-22 15:11:54 The "random columnist" turns out to be a distinguished professor of Psychology at Northeastern with expertise in "Affective Neuroscience, Psychology of Emotion, and Social and Personality Psychology". Seems pretty qualified to me!

2022-06-22 15:11:53 What's the probability that science has advanced in the last 150 years? Very high, I would think. The belief that we can never improve on the genius of our forefathers is the exact opposite of what science should be about. https://t.co/u3wZcszhXB

2022-06-22 14:45:22 @itjohnstone I don’t find this critique of frequentism convincing because it seems to also rule out causal reasoning. “Gravity causes objects to fall” is a statement about a class of events but most would also feel comfortable applying it to a single event of an object falling as well.

2022-06-22 14:35:50 The polling firm @YouGov did a poll of 24,000 Americans to create this alignment chart. I love how much effort they put into this lol. https://t.co/JKC3Koy0II

2022-06-22 14:03:02 @itjohnstone That's a fair framing I think, but I would say that "a model of samples of multiple recurring events" is their way of modeling the event.

2022-06-22 13:59:43 @ben11kehoe I think about that all the time! Would love to hear an answer from someone who knows a lot about Bayesian inference as it relates to philosophy of science...maybe @lastpositivist???

2022-06-22 00:19:51 he’s a 10 but only in binary

2022-06-22 00:03:31 He’s an 8 but he thinks R is a real programming language

2022-06-21 23:07:01 Viewed this way, it’s unsurprising that these approaches can be similar but often come out differently since they’re modeling different but related things.

2022-06-21 22:50:12 I’ve been thinking lately that the main difference between Frequentism and Bayesianism is Frequentists try to model the event while Bayesians try to model a rational observer’s beliefs about the event.

2022-06-21 19:27:00 why do people keep saying this??? my opinion is basically that: - ml and stats are different activities - sometimes people in ml anthropomorphize their work which is funny to me - sometimes they exaggerate how well things work. how is this anti ml? lol. https://t.co/IeoMWkZ5TG

2022-06-21 15:29:15 In addition, the variable could genuinely be purely bimodal with no underlying structure since nature doesn’t owe us simplicity https://t.co/kRDFw2PKS7

2022-06-21 15:29:13 The other side points out that even so, there will inevitably be downsides, like false positives, false negatives and legitimately tough calls for stuff that falls in the middle. https://t.co/ukjP8gIUW2

2022-06-21 15:29:12 One side will argue that you can imagine something like this where the original data is interpreted as coming from a mixture https://t.co/HGlMxTi2JZ

2022-06-21 15:29:11 as a statistician, it’s amazing to me how many stupid culture war debates come down to arguing about whether this kind of variable is best described as BIMODAL (tending to produce data with two humps) vs BINOMIAL (tending to take on two values potentially with some randomness). https://t.co/uzAfFKMBAk
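
A rough sketch of the distinction drawn in this thread, with made-up parameters (means 0 and 4, standard deviation 1, a cutoff at 2). A two-valued variable has nothing in the middle, while a bimodal mixture does, so any cutoff produces some false positives and false negatives.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

# Two-valued variable: every observation is exactly 0 or 1.
binary_sample = [int(random.random() < 0.5) for _ in range(1000)]

def draw_bimodal():
    """Draw from a 50/50 mixture of two normals (hypothetical means 0 and 4)."""
    from_upper_hump = random.random() < 0.5
    mean = 4.0 if from_upper_hump else 0.0
    return from_upper_hump, random.gauss(mean, 1.0)

mixture_sample = [draw_bimodal() for _ in range(1000)]

# Force the bimodal data into two categories with a cutoff at the midpoint.
threshold = 2.0
misclassified = sum(
    (value > threshold) != from_upper_hump
    for from_upper_hump, value in mixture_sample
)

distinct_binary_values = sorted(set(binary_sample))
print("distinct values of the two-valued variable:", distinct_binary_values)
print("mixture points on the wrong side of the cutoff:", misclassified)
```

Even with well-separated humps, a handful of points land on the wrong side of the cutoff, which is the "legitimately tough calls" problem in miniature.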

2022-06-21 14:13:55 4 statisticians and a data scientist https://t.co/KlGkrUcCLR

2022-06-21 11:49:27 RT @kareem_carr: this was my multiverse of madness https://t.co/d7bDFWC0G5

2022-06-21 03:41:53 @78384986xx this is an interactive site shared by another commenter: https://t.co/6h9BYQiZoM

2022-06-21 03:39:43 @natureguidesbc https://t.co/g5M75nkS9k

2022-06-21 03:39:10 @rumianreza Yes!

2022-06-21 03:38:52 @MEPjoe This explains it better. But basically it’s because the relationship happens as part of a limiting process. https://t.co/yZnTnZP4zt

2022-06-21 03:36:57 @comp_phys_marc Of course, buddy! https://t.co/472obHxl5a

2022-06-21 01:02:55 RT @kareem_carr: this was my multiverse of madness https://t.co/d7bDFWC0G5

2022-06-20 21:51:18 @MFumanelli Nice!

2022-06-20 19:30:51 @KakatiTweets Most people learn about the Normal distribution or the Binomial and build out their knowledge from there. The Exponential distribution is also surprisingly central. This little triangle between the Normal, Poisson and Binomial is one of the first relationships students learn. https://t.co/lwvAFEnY9I
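
One side of that triangle can be checked numerically: a Binomial(n, λ/n) distribution approaches a Poisson(λ) distribution as n grows. The sketch below picks λ = 3 and compares the two probability mass functions for k = 0..10; these values are illustrative choices, not from the tweet.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3.0  # illustrative rate

def max_gap(n):
    """Largest pointwise difference between the two pmfs for k = 0..10."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(11)
    )

gap_small_n = max_gap(10)
gap_large_n = max_gap(10_000)
print(f"n=10:     max gap {gap_small_n:.5f}")
print(f"n=10000:  max gap {gap_large_n:.7f}")
```

The gap shrinks as n increases with n·p held fixed, which is what it means for the relationship to hold as part of a limiting process.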

2022-06-20 19:21:32 @savethebees93 That’s a good idea although it would probably take a whole series.

2022-06-20 15:01:23 this was my multiverse of madness https://t.co/d7bDFWC0G5

2022-06-19 16:37:01 the academic urge to collect credentials like infinity stones

2022-06-19 15:55:57 how to have fun as an academic https://t.co/vVcxW2Lhl8

2022-06-19 15:18:49 “There are three kinds of lies: Lies, Damned Lies and Statistics. Ha ha ha.” Just stop. When Mark Twain first said it, like 99.9%* of stats hadn’t been invented yet. This is like judging medicine now based on medicine 100 years ago. --- * warning: may not be entirely accurate

2022-06-19 08:25:48 RT @kareem_carr: my best work ever https://t.co/zoR2pxF7Cc

2022-06-18 15:55:04 the comic teaches us a valuable lesson that pie charts are terrible https://t.co/P3fq5tRi0M

2022-06-18 05:16:30 RT @kareem_carr: my best work ever https://t.co/zoR2pxF7Cc

2022-06-18 03:35:12 RT @seldo: Holy shit the accuracy.

2022-06-18 00:53:57 yeah, sex is cool, but have you ever spent your friday night building yourself a new desktop. this motherboard lookin thicc af

2022-06-18 00:14:11 @Tim_Hua_ just jokes friend

2022-06-18 00:05:26 @dustdevildeity Sure. A few examples are logical reasoning, imagination, intuition, faith (for instance in moral matters), memory, science, emotion, the five senses.

2022-06-17 23:22:39 @ylecun @olujoe_1 @Twitter Weirdly, they seem to not have a scientist category, but I think you would qualify under their influencer category.

2022-06-17 23:04:35 I don't think it's plausible that some groups of people have ways of knowing that are unique to them, but of the ways of knowing that are available to all of us, some groups might prioritize certain ways of knowing over others in a way that might be characteristic of their group.

2022-06-17 23:04:34 There's a way in which "other ways of knowing" discourse makes some sense to me.

2022-06-17 21:14:53 @kneupane @ylecun smart move imo

2022-06-17 21:12:44 @Night_Burner_ @ylecun I feel like it might be more fair (to me) to read the whole thread and look up some of the information yourself before commenting. Have a good day.

2022-06-17 21:04:54 @ylecun To be clear, this is not a serious post but I do mention the grad school/undergrad aspect in the thread. Wikipedia lists MS as the highest degree for Brin and Page. Not saying that’s correct but the wiki page authors probably put more effort into this than I did.

2022-06-17 15:15:14 my best work ever https://t.co/zoR2pxF7Cc

2022-06-17 11:53:12 @AndrewLBeam @2plus2make5 To be clear, usually you're the one doing me a favor so that's not how I feel about it.

2022-06-17 11:52:03 @AndrewLBeam @2plus2make5 I think some people are bothered by the effort imbalance. One side just sends a quick email while the other side has to go dig around the calendar for available spots. The traditional way takes longer but is equally tedious for both sides.

2022-06-17 01:31:40 they seem very normal

2022-06-17 01:31:39 me, a statistician: my politics is whatever this is https://t.co/z4QSMNZtX6

2022-06-16 23:24:50 @aetataureate I’m actually feeling more positive about it lol. Like I used to want to make it more like statistics and now I’m happy with it just being its own thing.

2022-06-16 23:10:42 choose your own adventure https://t.co/6RnMfMvQjR

2022-06-16 16:51:07 Link to source [pdf]: https://t.co/42ONgn42C9

2022-06-16 16:51:06 Race is a social construct not a biological one. https://t.co/mExhwv7gFi

2022-06-16 15:15:00 Data science is handcrafted statistics, statistics is artisanal machine learning, and machine learning is organically grown, locally sourced linear algebra.

2022-06-16 14:33:00 Did you hear about these two guys who built a boat so they could go on cruises and stuff with all their friends, but then one of the guys went and sank it for absolutely no reason? He completely destroyed their friend ship.

2022-06-16 11:49:17 @wtgowers Yikes! Hope the lack/mildness of symptoms continues.

2022-06-16 04:04:33 @kernlfunction This article gives a general overview. https://t.co/vgSE5BweV8

2022-06-16 01:28:03 suck it google. achieved sentience on my first try. https://t.co/EdciRIALcB

2022-06-16 00:46:45 gonna tell my kids this was machine learning https://t.co/HCwlubsikz

2022-06-16 00:05:46 To be clear, in my professional capacity as a statistician, I would just like to reassure everyone that absolutely no science was used in the creation of this thread.

2022-06-15 23:46:56 I don’t know what to make of this pattern but the Stanford ones all dropped out of graduate degrees and the Harvard ones all dropped out of undergrad.

2022-06-15 23:46:55 Did some idle research yesterday. I noticed some weird patterns like out of the 15 richest billionaires, 9 dropped out of college, usually Harvard or Stanford. https://t.co/9Ss83dJvWl

2022-06-15 16:47:04 scientist: every science uses data so really all science is data science. haha. *monkey's paw curls*

2022-06-15 15:43:00 the day everything changed https://t.co/3RNZqa83Wv

2022-06-15 14:45:53 GAMs are ML now??? OK I give up. Statistics is fake news. It’s all Machine Learning. https://t.co/U2QG1Pvh46

2022-06-15 14:02:00 We need a government funded supercomputing infrastructure specifically for enabling the machine learning community to create open access/open data versions of models like GPT-3 and DALLE-2. The advantage that covert actors have in this space is a risk to all of humanity.

2022-06-14 15:14:21 @valanchee That seems fair to me. ML very much embraces the idea of the black box and the fact that you can just use the data to tell you whatever you want to know about the model (for instance how well it performs).

2022-06-14 14:58:31 I think of statistics as a way of scientifically analyzing the properties of black boxes. It's useful in analyzing situations where you can see some of the inputs and outputs of a system, but don't fully understand the processes involved.

2022-06-14 13:54:21 emeritus professors in the front row be like https://t.co/2UWHVcfEEC

2022-06-14 00:23:29 I call this one “The Art of Programming” https://t.co/rx7o6p4GpM

2022-06-13 16:03:00 One day somebody is going to claim their talking sex bot is sentient and all hell is going to break loose.

2022-06-13 15:35:11 they get it from us (literally) https://t.co/HRtYNz8Hji

2022-06-13 15:01:00 I'm willing to bet real money that there are literally hundreds of stories about sentient artificial intelligences in the training data of that Google AI.

2022-06-13 14:13:21 https://t.co/Pr278FAc0t

2022-06-13 01:38:15 men will literally claim a chatbot is sentient instead of going to therapy.

2022-06-12 19:21:21 me, trying to get published in the washington post: “is harry potter sentient?”

2022-06-12 14:52:57 Can you imagine what bad actors in either the public or private sector could do with access to an army of virtual people that humans were strongly inclined to love, trust and obey as if they were real people?

2022-06-12 14:52:56 It would be extremely silly for us to fall for our own fake data coming out of our fake data machines. We've made machines that can convincingly fake evidence of an inner life. We shouldn't mistake that for an actual inner life.

2022-06-12 14:52:55 Similarly, an ability to converse would normally be strong evidence that we are talking to a person of some kind. Now that we've made AI specifically designed to produce realistic conversations, we can no longer take such conversations as evidence of an inner life.

2022-06-12 14:52:54 The conversations I produce are just the *evidence* of my inner life. Those conversations aren't my inner life itself. I would continue to have an inner life even if I chose to never speak to another person again.

2022-06-12 14:52:53 Seems pretty idiotic to me to define sentience as "being capable of having a conversation". I would define sentience as something more like having some kind of inner life. https://t.co/3x4rLGuv7X

2022-06-11 23:13:04 stop crying https://t.co/KTILcBE0rs

2022-06-11 22:28:34 The biggest impact deep learning has had on me intellectually is it has radically expanded my conception of a mathematical function. I see everything else about deep learning as just a means to an end: a path to all these crazy mathematical spaces where our language and art live.

2022-06-11 14:43:51 me, an expert, skillfully debugging my code https://t.co/EQtZtsihbM

2022-06-10 23:25:22 “the proof is left as an exercise for the reader” the proof: https://t.co/bOqLRnZZKE

2022-06-10 15:07:14 they all post like three preprints a day https://t.co/ch8taPhuQg

2022-06-09 18:54:40 https://t.co/gDGUbe739P

2022-06-09 18:54:18 https://t.co/o4uhvKKU8z

2022-06-09 18:53:50 I would recommend "Chaos: A Very Short Introduction" if you don't want to deal with too much math and "Nonlinear Dynamics and Chaos" by @stevenstrogatz if you're open to some light calculus. https://t.co/6bUzLmTEkx

2022-06-08 20:09:58 @JSEllenberg I think the classic comparison that economists use to talk about human irrationality is cutting your own hair to save $5 but not being willing to cut your neighbor’s hair to make $5.

2022-06-08 20:06:49 @JSEllenberg These are different even from an economic perspective. The second one has an extra opportunity cost of not being able to do whatever you were gonna do instead. Maybe you have some other opportunity that would make you $5000. To summarize: (a) $500 (b) -$500 + your time

2022-06-08 19:52:18 @lastpositivist Ultimately this might come down to who’s involved but it seems to me that the tech ethics community actually presents a legal risk to tech companies by exposing potential areas of harm that are currently unknown to the public.

2022-06-08 19:50:01 @lastpositivist Interesting observation. In my experience, bioethics guidelines feel much like HR guidelines as in “this is what we all need to do to avoid getting sued”. Much like HR, I’ve found that bioethics groups are often in a protective role vis a vis the organization.

2022-06-08 14:54:38 @doctorBaytas Good question. I think there'll be a need for content distribution platforms that you can trust as well.

2022-06-08 14:43:21 For instance, we might not be able to trust a video, but we might trust our favorite journalist who says they were a direct witness or we might trust them when they say they confirmed the event through multiple sources that they trust.

2022-06-08 14:43:20 Advances in AI will soon give us all the power to create extremely realistic fake visual, audio and text evidence, but I don't think this means we'll never believe anything again. I think we'll adapt by focusing less on the "facts" and more on the reputation of who is sharing them.

2022-06-08 14:15:35 @dankettercello watermark from the app i used to make the image maybe.

2022-06-08 14:07:27 my favorite fact about python is if you type “import this”, the response is a super emo poem about coding in python https://t.co/aGadtN89mW

2022-06-08 00:26:20 I think the main takeaway from the DALLE-2 secret language fiasco is that claims about machine learning models should be treated as *scientific* claims, which require properly designed experiments and statistical analyses as proof, so we avoid confusing correlation with causation.

2022-06-08 00:17:14 @nsaphra Nope. Thanks. Will check it out.

2022-06-07 23:42:15 seems to me you could interpret that as “sex is real” but also as “sex is not real”. sometimes mathematical realities are hard to translate into english.

2022-06-07 23:42:14 somebody asked me what i think biological sex is mathematically. i’m no expert but i picture something like a dynamical system with two attractors which we call “male” and “female”. i see the attractors as an emergent property of a *group* of body plans vs specific individuals https://t.co/0KA2y1Dbqk

2022-06-06 09:59:48 @skdh Hope you feel better soon!

2022-06-05 21:49:03 No more math culture war for me, @BretWeinstein. Truce! https://t.co/KVa7kkq9Il

2022-06-05 19:30:50 Obviously ignoring the full complexity of a situation is simpler for the mathematical modeler, but this simplicity often comes at the expense of the people not included in the model.

2022-06-05 19:30:49 Hey. Statistician here. Putting aside the issue of social construction for a second, the variable depicted here is not binary. It literally does not have just two values. https://t.co/CLINZXpVP7

2022-06-05 14:44:00 the data science hierarchy of needs https://t.co/YE3HJKhxar

2022-06-05 00:38:26 @MrHonner @harald_bohr Nice! Here’s another thing I did that’s related. Also when the exponent p=∞, the solution is the midrange i.e. (min x + max x)/2 https://t.co/y22kcUfZ9Z
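
A quick numerical check of the p=∞ remark in the reply above, as a minimal sketch: it assumes the p=∞ case means minimizing the largest absolute deviation, and the sample values and search grid are made up for illustration.

```python
def max_abs_deviation(center, data):
    """Largest absolute distance from `center` to any data point (the p = infinity loss)."""
    return max(abs(x - center) for x in data)


def midrange(data):
    """(min x + max x) / 2, as written in the tweet."""
    return (min(data) + max(data)) / 2


data = [1.0, 2.0, 4.0, 11.0]                    # hypothetical sample
candidates = [i / 100 for i in range(0, 1201)]  # grid from 0.00 to 12.00

# Brute-force search for the center with the smallest worst-case deviation.
best_center = min(candidates, key=lambda c: max_abs_deviation(c, data))

print(best_center, midrange(data))
```

On this sample the grid search and the midrange formula agree, which is what the tweet claims for the general case.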

2022-06-04 15:32:54 Make a data scientist cry with just 4 words

2022-06-04 10:55:50 @CleoHariMSc It’s in my tweet history somewhere. I didn’t delete it but the video link wasn’t working last time I saw it.

2022-06-03 23:45:59 predictions / p-values / posterior distributions https://t.co/zCsULGvPSF

2022-06-03 23:17:03 I like to think of my simplified explanations as the solution to an optimization problem. They’re the closest low-dimensional approximation of the original idea. https://t.co/EH5rlw71X2

2022-06-03 00:02:06 Y’all laughing but I bet they get the census done real fast in Sweden https://t.co/82DHehS2S4

2022-06-02 00:32:59 controversial opinion: you don’t have to be bayesian to use bayes’ theorem

2022-06-01 23:33:49 I’ve encountered three definitions of “data scientist”: 1. people who collect, curate and visually summarize massive datasets 2. people who apply machine learning to real world data 3. statisticians who focus on user friendliness. Which one most closely fits your definition?

2022-05-26 13:32:01 RT @EpiEllie: Why doesn’t “individual rights” ever mean the right to a safe &

2022-05-25 05:41:49 RT @kareem_carr: The 3 most important statistical averages seem a bit arbitrary. The mode is the most common data point. The median divides…

2022-05-24 15:39:51 Each average minimizes a different kind of distance to the data.

2022-05-24 15:39:50 The 3 most important statistical averages seem a bit arbitrary. The mode is the most common data point. The median divides the data in half. The mean is the sum of the data divided by the number of data points. BUT there's a way of looking at them that ties it all together https://t.co/ynofuYHrom
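
The claim in the thread above, that each average minimizes a different kind of distance, can be checked with a small sketch. It assumes the distances meant are the standard ones (count of mismatches for the mode, sum of absolute differences for the median, sum of squared differences for the mean); the linked image is not reproduced here, and the sample and search grid are made up.

```python
from statistics import mean, median, mode

data = [1, 1, 2, 4, 7]  # hypothetical sample


def mismatch_loss(c):
    """Number of data points that differ from c."""
    return sum(x != c for x in data)


def absolute_loss(c):
    """Sum of absolute differences between c and the data."""
    return sum(abs(x - c) for x in data)


def squared_loss(c):
    """Sum of squared differences between c and the data."""
    return sum((x - c) ** 2 for x in data)


candidates = [i / 10 for i in range(0, 81)]  # grid from 0.0 to 8.0

best_mismatch = min(candidates, key=mismatch_loss)
best_absolute = min(candidates, key=absolute_loss)
best_squared = min(candidates, key=squared_loss)

print(best_mismatch, mode(data))    # minimizer of the mismatch count vs. the mode
print(best_absolute, median(data))  # minimizer of absolute loss vs. the median
print(best_squared, mean(data))     # minimizer of squared loss vs. the mean
```

A grid search is used instead of calculus so that the three losses are treated identically; only the distance function changes between the three averages.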

2022-05-24 15:36:07 @Aella_Girl I think you might enjoy this article by renowned epidemiologist @ken_rothman: https://t.co/GPKCfVhHYz

2022-05-24 15:27:58 @miclugo @docmilanfar I like this point a lot! Reframing it as a kind of linearization both makes it easier to see how they would have come up with it while simultaneously emphasizing that the insight is related to calculus (approximation through linearization).

2022-05-23 23:51:04 The source is a short and very charming paper by @docmilanfar where he talks about learning the formula from his father who learned it from his father. https://t.co/aupCVzoDKR

2022-05-23 23:51:03 The Persian folk formula is surprisingly good! For a $10,000 loan at 7% interest for 4 years, the exact formula gives a monthly payment of $239.46 and the folk formula gives an estimate of $237.50

2022-05-23 23:51:02 Persian merchants use a formula for approximating loan payments that only seems to make sense if you know calculus.And yet, the Persian folk formula seems to predate the discovery of the ideas from calculus that are needed to understand it. https://t.co/J4oR2ONBpG
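
The two figures quoted in the thread above can be reproduced with a short sketch. The exact payment uses the standard fixed-rate amortization formula. The folk formula itself is only shown in the linked image, so the version below is an assumption: principal plus half of the simple interest over the whole term, divided by the number of months. It is used here because it reproduces the $237.50 in the tweet.

```python
def exact_monthly_payment(principal, annual_rate, years):
    """Standard amortized payment with monthly compounding."""
    n = years * 12        # number of monthly payments
    r = annual_rate / 12  # monthly interest rate
    return principal * r / (1 - (1 + r) ** -n)


def folk_monthly_payment(principal, annual_rate, years):
    """Assumed form of the folk rule: (principal + half the simple interest) / months."""
    n = years * 12
    simple_interest = principal * annual_rate * years
    return (principal + simple_interest / 2) / n


exact = exact_monthly_payment(10_000, 0.07, 4)
folk = folk_monthly_payment(10_000, 0.07, 4)

print(f"exact: {exact:.2f}, folk: {folk:.2f}")  # exact: 239.46, folk: 237.50
```

The "half" is a plausible place for the linearization mentioned in the reply at 15:27:58 to enter: the balance declines roughly linearly, so on average about half the principal is outstanding.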

2022-05-23 16:11:58 Just realized “back story” should be “backstory”. Just in case this bothers you as much as it bothers me, here’s the correction. https://t.co/mFsBaHSnLC

2022-05-23 15:46:00 This infographic is adapted from scholarship on the “total survey error” framework. Although originally developed for thinking about surveys, I find it a very useful framework for thinking critically about statistics in general.

2022-05-23 15:45:59 Statistics like “40% of patients experience severe illness” seem very straightforward and easy to understand, but this simplicity is deceptive.Here’s a taste of what goes into creating them: https://t.co/boNF0dSknS

2022-05-22 14:06:17 @natched I don’t think that’s a real definition. It doesn’t “mean” anything. It’s more like a specification.

2022-05-22 13:47:32 For statisticians, I’d say the word is “probability”. We just focus on the math and try not to fight about it. https://t.co/BQXn0EbGIO

2022-05-22 13:44:43 @angie_rasmussen @halvorz For virologists, I’d say the word is “alive” as in “is a virus alive or dead?”

2022-05-21 23:03:35 Even before deep learning started working amazingly well, it already had two things going for it. It had a very cool name *and* there was something super satisfying about getting to use the chain rule that many times in a row. It was like getting a streak of green traffic lights.

2022-05-20 14:57:50 Conversations that start with an invalidation of another group's feelings and experiences, even if factually correct, are going to be non-starters.

2022-05-20 14:57:49 I don't have any solutions but I have some intuitions.

2022-05-20 14:57:48 What's fascinating to me is so many of us are feeling the same way and yet that unified feeling hasn't brought us closer together.

2022-05-20 14:57:47 No matter how privileged a group might appear to people on the outside, there seem to be people within that group that feel under siege.

2022-05-20 14:57:46 Many groups feel persecuted for just living their life and being who they are: men, women, black people, white people, tenured ivy league professors, billionaires. (1/n)

2022-05-20 12:54:49 RT @kareem_carr: If I had lots of cash, I’d be much more drawn to the technical problem of how to design social systems where people felt i…

2022-10-26 16:02:36 I just realized I didn’t say this explicitly. Retweeting/Quote-tweeting are also allowed.

2022-10-25 15:24:49 hell yeah! https://t.co/wnfOv8o6fs

2022-10-24 02:22:45 RT @kareem_carr: A lot of people think math is never ambiguous which is false. 0⁰ is a perfect example of what ambiguity looks like in mat…

2022-10-23 23:01:31 @quaesita look at all those math books! the chainmail shirt is cool too.

2022-10-23 15:11:45 Reference: the wikipedia article on 0⁰ is very good if you want to read more about it. https://t.co/NeacDAV2ui

2022-10-23 15:11:44 Unfortunately (or fortunately), having a value for 0⁰ is really helpful in a lot of cases. Many find it much too convenient to just completely avoid it.

2022-10-23 15:11:43 A lot of people think math is never ambiguous which is false. 0⁰ is a perfect example of what ambiguity looks like in mathematics.
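
As a small illustration of the thread above, here is a sketch of how one language handles the question. Python is chosen only as an example; the thread does not mention it. Python assigns 0⁰ the value 1, yet the two natural limits that approach 0⁰ disagree, which is one way to see the ambiguity.

```python
import math

# Python adopts the convention 0^0 = 1 for both integers and floats.
integer_power = 0 ** 0
float_power = math.pow(0.0, 0.0)

# Two ways of approaching 0^0 that give different answers:
x_to_zero = [x ** 0 for x in (0.1, 0.01, 0.001)]  # x^0 as x shrinks toward 0
zero_to_x = [0 ** x for x in (0.1, 0.01, 0.001)]  # 0^x as x shrinks toward 0

print(integer_power, float_power)  # 1 1.0
print(x_to_zero)                   # [1.0, 1.0, 1.0]
print(zero_to_x)                   # [0.0, 0.0, 0.0]
```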

2022-10-22 21:16:33 me: how much would it cost for me to train this model? machine learning community: https://t.co/wBQYl2hdEX

2022-10-22 14:28:05 choose your weapon https://t.co/iNE56TGZe0

2022-10-29 16:28:23 This platform tends to overreact to change. I’m going to continue having fun talking about statistics and math on Twitter as long as nothing prevents me.

2022-11-17 15:54:06 no data scientist has mastered all 5: 1. R 2. Python 3. javascript 4. making eye contact 5. Microsoft Excel

2022-11-17 14:58:40 coding is lowkey goated in situations where feeling like an idiot for multiple hours is the vibe.

2022-11-16 15:39:00 The larger the machine learning models, the more we will need statistics to understand them.

2022-11-16 15:38:59 Machine learning vs statistics is a false dichotomy. The most successful machine learning models are vast collections of numbers whose relationship to the model’s behavior is poorly understood. This is itself a statistics problem.

2022-11-16 01:11:35 which one are you?

2022-11-18 23:39:13 *notices Taylor Swift is trending* Have you heard about Taylor’s new series? . . . . . . . . It’s very derivative.

2022-11-18 23:15:13 @DavidSabatini2 To be fair, it’s not that great on twitter either. I think math needs a visual format like a video but also a medium where you can give students feedback. The idea of the substack would be to talk about stuff besides the specifics of the hardcore math like the overall concepts.

2022-11-18 23:09:10 RT @kareem_carr: What if Twitter dies? How are we going to keep laughing and learning about data science and statistics? I have the answer…

2022-11-18 17:52:20 @fb0904e981384bb R has more statistics packages and the packages are generally higher quality than Python in terms of statistical correctness. Excel is useful because a lot of regular people know how to use it so it can be a really powerful place to interface with non-data scientists.

2022-11-18 16:56:52 wow ok. looks like folks are into it! https://t.co/WI9bk1h5N5

2022-11-18 16:39:55 I won't be posting much right away but cool stuff is coming. I just need to be PhDone first.

2022-11-18 16:39:54 What if Twitter dies? How are we going to keep laughing and learning about data science and statistics? I have the answer! Sign up for my substack so we can keep in touch: https://t.co/JunUocCJtD

2022-11-19 11:33:09 RT @kareem_carr: What if Twitter dies? How are we going to keep laughing and learning about data science and statistics? I have the answer…

2022-11-19 03:15:38 RT @kareem_carr: no data scientist has mastered all 5: 1. R 2. Python 3. javascript 4. making eye contact 5. Microsoft Excel