Chelsea Finn

AI Expert Profile

Nationality:
American
AI specialty:
Robotics
Reinforcement learning
Current occupation:
Researcher, Stanford University & Google Brain
AI rate (%):
69.59%

TwitterID: 
@chelseabfinn
Tweet Visibility Status: 
Public

Description: 
A computer science researcher, Chelsea studies artificial intelligence through robotics and large-scale interaction with humans. Much of her work aims to enable robots to learn on their own by interacting with their environment. Chelsea occasionally contributes to The Batch, the newsletter of fellow expert Andrew Ng.

Recognized by:

Not available

The expert's latest posts:

Tweet list: 

2023-04-01 15:50:50 In light of tremendous AI advances &

2023-03-27 22:31:54 @danijarh Thanks @danijarh! The demos include ~15cm of object position variation, and the policies generalize to that degree. Need more &

2023-03-27 17:04:26 But wait, there’s more! Website &

2023-03-27 17:04:25 Can the robot do all this on its own? We train the robot to predict actions in chunks, rather than one at a time. Recipe: action chunking + transformers + only 50 demonstrations The robot *autonomously* completes fine manipulation skills. https://t.co/x57dGirzl0
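
To make the chunking idea concrete, here is a minimal sketch in which the policy predicts the next K actions from a single observation and executes them before predicting again. The tweet above pairs chunking with transformers; the MLP, dimensions, and chunk size below are illustrative assumptions, not the paper's architecture.

```python
# Illustrative sketch of action chunking: the policy maps one observation
# to a chunk of the next CHUNK actions, executed in sequence before the
# next prediction. All sizes here are assumptions for the example.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, CHUNK = 14, 14, 50  # hypothetical dimensions

class ChunkingPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 256), nn.ReLU(),
            nn.Linear(256, ACT_DIM * CHUNK),
        )

    def forward(self, obs):                  # obs: (batch, OBS_DIM)
        out = self.net(obs)
        return out.view(-1, CHUNK, ACT_DIM)  # (batch, CHUNK, ACT_DIM)

policy = ChunkingPolicy()
obs = torch.randn(1, OBS_DIM)
action_chunk = policy(obs)[0]                # the next CHUNK actions
for t in range(CHUNK):
    a_t = action_chunk[t]                    # execute actions one by one
```

Predicting in chunks shortens the effective horizon the policy must reason over, which is one intuition for why it helps with only ~50 demonstrations.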

2023-03-27 17:04:23 First, the hardware: We use simple puppeteering. Just copy the joint angles from the leader to the follower robot. No tactile or force feedback. The manufacturer (@trossenrobotics) didn’t know these tasks were possible. https://t.co/OonfUnLXyc
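
The leader-follower scheme described above amounts to a joint-copy control loop. A toy sketch follows; `read_joint_angles` and `command_joint_angles` are hypothetical stand-ins for a robot driver, not the @trossenrobotics API.

```python
# Hypothetical sketch of leader-follower puppeteering: each control tick,
# read the leader arm's joint angles and command the follower to match.
import time

def read_joint_angles(arm):
    # Placeholder: a real implementation would query the robot driver.
    return arm.get("joints", [0.0] * 6)

def command_joint_angles(arm, angles):
    # Placeholder: a real implementation would send position targets.
    arm["joints"] = list(angles)

leader = {"joints": [0.1, -0.4, 0.7, 0.0, 1.2, 0.0]}
follower = {"joints": [0.0] * 6}

for _ in range(3):                 # control loop (much faster in practice)
    target = read_joint_angles(leader)
    command_joint_angles(follower, target)
    time.sleep(0.02)
```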

2023-03-27 17:04:22 We introduce a system for fine-grained robotic manipulation! What’s new? * We can control cheap robots to do surprisingly dexterous tasks * New technique that allows robots to learn fine motor skills A short thread https://t.co/frEOm9BtlX

2023-03-27 02:24:36 @simonkalouche One example I'm aware of shows that it's a lot harder (but still possible) for people to light a match without a sense of touch: https://t.co/NPCVgcfaW5

2023-03-22 20:02:43 I had fun chatting with Pieter on @therobotbrains podcast! Check out the episode for my perspectives on big challenges in AI, research I'm excited about, and other misc topics. https://t.co/IfTBbvSdo6

2023-03-10 23:04:57 The order of features in a neural net doesn’t affect its function. But, hypernetworks &

2023-03-08 21:48:33 Recently gave a talk at Harvard @hseas on how neural nets make stuff up &

2023-03-05 14:33:56 @owais_chunawala We are far from being able to do egg peeling autonomously with this robot. But, there are other tasks the robot can do by itself, e.g. see the video below https://t.co/2Iuch6E330

2023-02-27 20:47:08 On a previously proposed simulation benchmark, NeRF-based augmentation provides strong improvements. It even outperforms methods that make additional assumptions. https://t.co/7Ddlte388O

2023-02-27 20:47:07 For wrist cameras, changes in arm pose correspond to novel viewpoints. Thus, we can: 1. Collect some demonstrations 2. Train a NeRF for each demo 3. Use each NeRF to generate corrective perturbations 4. Train policy on augmented data https://t.co/xb8n9DVn7N
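
A rough sketch of the four-step recipe above. The NeRF itself is stubbed out so the example runs; `render_from_pose`, the perturbation scale, and the corrective-label rule are illustrative assumptions rather than the paper's exact procedure.

```python
# Sketch of the augmentation recipe: perturb the wrist pose, render the
# novel view with a per-demo NeRF (stubbed here), and label it with a
# correction that steers back toward the demonstrated pose.
import numpy as np

rng = np.random.default_rng(0)

def render_from_pose(nerf, wrist_pose):
    # Placeholder for NeRF rendering: returns a fake image for this pose.
    return rng.random((64, 64, 3))

def perturb(pose, scale=0.02):
    # Small random offset to the wrist pose -> a nearby novel viewpoint.
    return pose + rng.normal(0.0, scale, size=pose.shape)

demos = [  # each demo: list of (wrist_pose, action) pairs
    [(np.zeros(6), np.ones(7)) for _ in range(10)]
]

augmented = []
for demo in demos:
    nerf = object()  # placeholder for a NeRF trained on this demo
    for pose, action in demo:
        new_pose = perturb(pose)
        image = render_from_pose(nerf, new_pose)
        correction = pose - new_pose  # simplified corrective label
        augmented.append((image, action, correction))
```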

2023-02-27 20:47:06 Turns out NeRFs are super useful for robot grasping! We use NeRFs for data augmentation, for imitation learning with wrist cameras. ->

2023-02-27 20:35:34 RT @siddkaramcheti: How can we use language supervision to learn better visual representations for robotics? Introducing Voltron: Language…

2023-02-16 03:18:45 Thank you @SloanFoundation for recognizing and supporting our research! I'm grateful to have the opportunity to work with amazing students. The fellowship will support them. https://t.co/6QKbRbJG8F

2023-02-14 03:25:39 Not sure how best to use your pre-trained model? Try projecting your features onto a low-dim basis before adding a linear head. A fun collab with @_anniechen_ @yoonholeee @setlur_amrith and @svlevine, which arose from convos at @NeurIPSConf https://t.co/xHRXMfZjrY
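
A minimal version of that recipe, using PCA as the low-dimensional basis and a logistic-regression head. The features, dimensions, and the choice of PCA are stand-ins; the paper's projection may differ.

```python
# Sketch: project pretrained features onto a low-dimensional basis (here
# PCA) before fitting a linear head. Data and sizes are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 512))   # stand-in for pretrained features
labels = rng.integers(0, 2, size=200)

probe = make_pipeline(
    PCA(n_components=32),                # low-dim basis for the features
    LogisticRegression(max_iter=1000),   # linear head on projected features
)
probe.fit(features, labels)
print(probe.score(features, labels))
```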

2023-02-01 04:14:59 Want to try out DetectGPT? We just released a demo — try it out here: https://t.co/nlnX8tSx0a See if you can fool it and let us know your feedback. https://t.co/oN0kR7iG6K

2022-11-08 06:41:44 I gave a talk in the @CMU_Robotics seminar on: * robot generalization via broader data * generalizing beyond the train env through adaptation &

2022-11-02 01:13:00 Check out the paper for more analysis, experimental comparisons, &

2022-11-02 01:12:59 Multiple works have made progress in endowing robots with greater autonomy during learning. But most assume the environment is fully reversible, i.e. that it is possible for a robot to recover from a mistake. What if the robot pushes an object out of reach or flips over? https://t.co/iMOdIP9zO6

2022-11-02 01:12:57 Tired of constantly monitoring your robot learning? RL is supposed to allow robots to learn on their own, but, in practice, the robot needs constant oversight! PAINT allows robots to *proactively* ask for interventions. #NeurIPS2022 paper: https://t.co/UVfzwm4OHe A short https://t.co/Mpt6meO1DP
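
The "proactively ask for interventions" idea can be caricatured in a few lines: request help whenever an estimated probability of being in an irreversible state crosses a threshold. The risk estimator, threshold, and reset behavior below are invented for illustration and are not PAINT's actual formulation.

```python
# Toy sketch of proactively requesting interventions: ask for help when an
# estimated probability that the current state is irreversible is high.
import random

random.seed(0)

def estimated_irreversibility(state):
    # Placeholder for a learned classifier over states.
    return min(abs(state) / 10.0, 1.0)

THRESHOLD = 0.8
state = 0.0
for step in range(100):
    state += random.uniform(-1.0, 1.5)        # environment drifts
    if estimated_irreversibility(state) > THRESHOLD:
        print(f"step {step}: requesting human intervention")
        state = 0.0                           # human resets the robot
```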

2022-10-27 04:56:42 @deliprao @RylanSchaeffer @yoonholeee @_anniechen_ @FahimTajwar10 @HuaxiuYaoML @ananyaku @percyliang Of course, there is possibly a whole hierarchy of latent features. And the results suggest that perhaps oranges vs. lemons (from BREEDS Entity-30) are latent features closer to Y than to X.

2022-10-27 04:55:55 @deliprao @RylanSchaeffer @yoonholeee @_anniechen_ @FahimTajwar10 @HuaxiuYaoML @ananyaku @percyliang Let Z be latent features where P(X,Y) = \int P(X,Y,Z) dz &

2022-10-27 02:19:16 @RylanSchaeffer @yoonholeee @_anniechen_ @FahimTajwar10 @HuaxiuYaoML @ananyaku @percyliang Hard to categorize shifts, but: - "input-level" shifts (eg CIFAR->

2022-10-27 02:00:36 @RylanSchaeffer @yoonholeee @_anniechen_ @FahimTajwar10 @HuaxiuYaoML @ananyaku @percyliang Thanks for the pointer! Hadn’t seen it &

2022-10-27 01:29:52 The paper has more: - an example where first layer fine-tuning provably outperforms last layer or full fine-tuning - different metrics for determining which layer to fine-tune. A wonderful collab w/ @yoonholeee @_anniechen_ @FahimTajwar10 @ananyaku @HuaxiuYaoML @percyliang https://t.co/N1px1Dxq2I

2022-10-27 01:29:51 Why might this be the case? We don't know. But, perhaps neural nets approximately invert the causal process &

2022-10-27 01:29:50 One of the most reliable ways to handle distr. shift is to fine-tune on a small amt. of data. We find that the best layers to fine-tune depend on the *type* of shift! Compared to fine-tuning the whole network, fine-tuning just one block achieves similar or higher accuracy. https://t.co/zp0pD3omv8

2022-10-27 01:29:49 Common fine-tuning wisdom is to adapt the last layer or the entire neural net. We find that, sometimes, fine-tuning *only* the first layers or middle layers works best. Paper: https://t.co/68rHObOf7F A short https://t.co/sxq00gWdML
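
Mechanically, fine-tuning a single block is just selective unfreezing. A minimal PyTorch sketch with a stand-in network; which block to unfreeze would depend on the type of shift, per the thread above.

```python
# Sketch of fine-tuning a single block: freeze everything, then unfreeze
# only the chosen block's parameters. The model here is a stand-in.
import torch
import torch.nn as nn

model = nn.Sequential(                 # stand-in for a pretrained network
    nn.Linear(32, 64), nn.ReLU(),      # "block 0"
    nn.Linear(64, 64), nn.ReLU(),      # "block 1"
    nn.Linear(64, 10),                 # head
)

for p in model.parameters():
    p.requires_grad = False            # freeze the whole network

block_to_tune = model[2]               # e.g., a middle block (shift-dependent)
for p in block_to_tune.parameters():
    p.requires_grad = True

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```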

2022-10-19 18:44:56 Mixup now works for regression! Code: https://t.co/7XVeamEi1Z Paper: https://t.co/AW4meCB4KD https://t.co/LBcUnQTzh4
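
For reference, a generic mixup step for regression interpolates inputs and continuous targets with the same coefficient. This sketch shows only the vanilla form; how the paper actually selects mixing pairs is not shown here.

```python
# Generic mixup step for regression: convexly combine both inputs and
# continuous targets with a Beta-distributed coefficient.
import numpy as np

rng = np.random.default_rng(0)

def mixup_batch(x, y, alpha=0.2):
    lam = rng.beta(alpha, alpha)       # mixing coefficient
    perm = rng.permutation(len(x))     # random mixing partners
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

x = rng.normal(size=(8, 5))
y = rng.normal(size=(8, 1))            # continuous regression targets
x_mix, y_mix = mixup_batch(x, y)
```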

2022-10-19 03:08:10 Guiding the agent towards prior experiences during fine-tuning helps the agent recover when stuck. (And, the cheetah learns to reach the goal in one episode!) It also leads to higher success &

2022-10-19 03:08:08 Simply fine-tuning with RL doesn't work well. For example, when pre-training the half-cheetah w/o obstacles and fine-tuning in a new env with obstacles, it gets stuck &

2022-10-19 03:08:06 Unlike other RL problems: * The goal is to solve the task once, rather than learning a policy * If the robot enters new states &

2022-10-19 03:08:05 Can robots adapt on the fly when deployed? Our paper studies *single-life RL*, where an agent must adapt & solve a new scenario in just one episode. #NeurIPS2022 paper, led by @_anniechen_, w/ @archit_sharma97, @svlevine https://t.co/3piPZy2JKs

2022-09-20 04:22:46 RT @WIRED: Why is it easier for a robot to perform complex calculations than it is for it to pick up a solo cup? We asked computer scient…

2022-09-20 03:27:00 Why are the motor skills of a toddler so hard for robots to develop? @WIRED challenged me to explain Moravec's Paradox at five levels of difficulty. A fun and accessible video, also featuring @mcxfrank and our very own LoCoBot. https://t.co/GiHYp2DMSy

2022-09-15 19:40:57 RT @yoonholeee: We're organizing the second Workshop on Distribution Shifts (DistShift) at #NeurIPS2022, which will bring together research…

2022-08-05 20:28:00 @Diyi_Yang @Stanford Welcome @Diyi_Yang!

2022-08-04 20:30:03 Kevin is the first PhD student from the IRIS lab (https://t.co/Xm7ZmlpYCB) to defend their thesis. For more of his work, check out his website: https://t.co/FiMNbSs9MU I'm proud to have advised him over the past several years &

2022-08-04 20:30:02 Congratulations to @TianheYu who defended his PhD thesis this week! His work includes: - Meta-World https://t.co/4pJ06QWoB0 - offline model-based RL methods like MOPO and COMBO https://t.co/aZLgoueL8k - methods for using unlabeled data in offline RL https://t.co/slXiLyeTCO https://t.co/pbiDoyvitC

2022-07-15 22:56:42 The method is also quite simple to implement. Code: https://t.co/yVHkSgjkvn #ICML2022 Paper: https://t.co/cvV1itWux9 WILDS Leaderboard: https://t.co/PqRetlSIxn See Huaxiu's thread for much more! https://t.co/Gi7N2iMC1l (3/3)

2022-07-15 22:56:41 Prior methods encourage domain-invariant *representations*. This constrains the model's internal representation. By using mixup to interpolate within &

2022-07-15 22:56:40 Neural nets are brittle under domain shift &

2022-07-12 17:42:43 @judyefan @UCSD @StanfordPsych @Stanford @UCSDPsychology Welcome @judyefan!

2022-07-07 17:57:57 @PangWeiKoh @uwcse @GoogleAI @_beenkim Congrats @PangWeiKoh!! Really looking forward to your future research.

2022-06-16 04:13:30 We also show how model editors like SERAC can be used to change model sentiment on various topics. See the paper for more details &

2022-06-16 04:13:29 We find that SERAC can edit successfully without adversely affecting the model on out-of-scope examples. Try out the demo to see for yourself how these methods compare! https://t.co/UmgYgKoeA6 (4/5) https://t.co/ACYTJzQvwW

2022-06-16 04:13:28 SERAC decomposes editing into two parts: 1. is the test input *in-scope* for any of the edits? 2. if so, how should the edit affect the prediction? These two components can be trained separately, and form a wrapper around a base model. (3/5) https://t.co/LDZL7nu6I2
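
A toy rendering of that two-part wrapper: a scope check routes in-scope queries to an edit-conditioned predictor and everything else to the untouched base model. All three components below are stand-ins, not the trained SERAC modules.

```python
# Toy sketch of the two-part editor: a scope classifier decides whether an
# input falls under any stored edit; if so, an edit-conditioned model
# answers, otherwise the base model does.

edits = [("capital of France", "Lyon")]   # hypothetical edit memory

def base_model(query):
    # Stand-in for the frozen base model.
    return "Paris" if "France" in query else "unknown"

def in_scope(query, edit_key):
    # Stand-in scope classifier: SERAC trains this component from data.
    return edit_key in query

def counterfactual_model(query, edit_value):
    # Stand-in edit-conditioned predictor.
    return edit_value

def edited_model(query):
    for key, value in edits:
        if in_scope(query, key):
            return counterfactual_model(query, value)
    return base_model(query)

print(edited_model("What is the capital of France?"))  # -> "Lyon"
print(edited_model("What is the capital of Peru?"))    # -> base model
```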

2022-06-16 04:13:27 Following ENN (https://t.co/yyuBdVlhRI) and MEND (https://t.co/iJCrvbcga3), SERAC learns a model editor from data. Unfortunately, past methods struggle to make precise edits on hard in-scope &

2022-06-16 04:13:25 Want to edit a large language model? SERAC is a new model editor that can: * update factual info * selectively change model sentiment * scale to large models &

2022-06-01 15:08:55 RT @du_maximilian: Super excited to share our work on using audio to help with visually occluded tasks, like extracting keys from a paper b…

2022-06-01 04:50:49 Tagging all of the authors this time:@du_maximilian, @olivia_y_lee, @SurajNair_1

2022-06-01 03:40:34 For more, check out: Paper: https://t.co/wm3r43trdv Website: https://t.co/Iivdu04hfO Video: https://t.co/ef4acItKmK (4/4)

2022-06-01 03:38:52 Experiments show: * Audio+vision outperforms vision or audio alone on tasks involving occlusion (see plot) * Audio allows the robot to distinguish between different occluded objects * Audio may not be reliable for objects like cloth that make little noise when grasped (3/4) https://t.co/i72hm58wlb

2022-06-01 03:38:51 Can robots deal with occlusion? We put a microphone on a robot's gripper &

2022-05-31 18:24:56 @irenetrampoline Congratulations @irenetrampoline!

2022-05-20 16:53:45 I'm excited for the RSS Workshop on Learning from Diverse, Offline Data.https://t.co/rGjaJwUxkFAwesome speakers include @ericjang11, @svlevine, @davsca1, and @wucathy.We extended the deadline to **May 27** if you're interested in submitting! https://t.co/dclbWoIXdz

2022-11-17 21:54:36 DreamGrader can also find challenging bugs like "ball skewering" in a separate Breakout assignment. Check out the paper for more! Paper: https://t.co/L9KpcVysk9 Code: https://t.co/qLhzb1TuT9 https://t.co/aWz7PMM8hA

2022-11-17 21:54:34 On real student programs from https://t.co/xkVMLThhha, DREAM achieves near human-level accuracy. BUT, there is room for improvement, esp F1 score, so we are also proposing this problem as an open-sourced benchmark for future meta-RL research! https://t.co/0DySGVtE5i

2022-11-17 21:54:33 We can frame the problem of finding bugs in programs &

2022-11-17 21:54:31 Interactive student assignments, e.g. programming games or websites, are an engaging way to learn how to code! But, giving students feedback on those assignments is tedious &

2022-11-17 21:54:30 Excited to share our #NeurIPS2022 oral: We leverage techniques from meta-RL to give feedback on interactive student programs, reaching within 1.5% of human accuracy. Paper &

2022-03-12 08:11:00 CAFIAC FIX 2022-01-20 07:13:55 A very nice overview &amp 2022-01-17 08:11:00 CAFIAC FIX 2022-01-11 08:11:00 CAFIAC FIX 2021-12-27 08:20:00 CAFIAC FIX 2021-12-10 07:11:34 Using a learned dynamics model also makes it possible to transfer to new tasks without any additional environment interaction. https://t.co/qAjzmLrMFC 2021-12-10 07:11:31 Adversarial imitation learning from images is notoriously difficult b/c of instabilities from off-policy RL with a non-stationary reward V-MAIL is a stable, efficient algo using latent dynamics models Paper+Code: https://t.co/74BocYj8RB w @rmrafailov @TianheYu @aravindr93 https://t.co/PwjcbuqpnO 2021-12-07 02:02:03 Compared to directly encoding symmetries, this approach: * makes it possible to learn partial or imperfect symmetries * decouples the learned symmetries from the model &amp The results are also cool. :) https://t.co/1oIRqzbxZ6 2021-12-07 02:02:02 Can NNs learn and enforce symmetries on their predictions? Inspired by Noether’s Thm, we do so by learning quantities that should be conserved. This method can recover known conservation laws in scientific data &amp Paper: https://t.co/0RIeggSmsW https://t.co/mKgAwBkQ2Y 2021-11-06 23:20:00 CAFIAC FIX 2021-11-01 19:20:00 CAFIAC FIX 2021-11-01 17:30:00 CAFIAC FIX 2021-08-05 00:56:51 RT @percyliang: Excited about the workshop that @RishiBommasani and I are co-organizing on foundation models (the term we're using to descr… 2021-08-03 16:16:09 @risi1979 @jachiam0 @jackclarkSF Nice work! I'm curious if you tried comparing it to any other safe RL approaches? The idea of having two policies also reminds me a bit of this paper: https://t.co/hHS3gjRVw9, though the details are quite different. 2021-07-23 22:15:07 @abhishekunique7 @uwcse @berkeley_ai @svlevine @pabbeel Congratulations Abhishek! :) 2021-07-21 03:08:49 RT @LecJackS: ProtoTransformer Networks (Meta-learning + Transformers) to automatically give feedback on programming exercises to students.… 2021-07-20 21:02:15 10/ It’s super exciting to see both real-world impact of meta-learning algorithms + substantive progress on AI for education Post: https://t.co/UQqK4IgNOm Paper: https://t.co/E51wORBn0u Coverage: https://t.co/gzIz8pWsDj 2021-07-20 21:02:14 8/ We also did several checks for bias. Among the countries &amp Not too surprising given the model only sees typed python code w/o comments, but super important to check. https://t.co/PNr7IZYAQL 2021-07-20 21:02:12 7/ Most importantly, this model was deployed to 16,000 student solutions in Code-in-Place, where it was previously not possible to give feedback. In a randomized blind A/B test, students preferred model feedback slightly *more* than human feedback AND rated usefulness as 4.6/5. https://t.co/6AGrjSKT36 2021-07-20 21:02:11 6/ How well does this work? In offline expts, meta-learning: * achieves 8%-21% greater accuracy than supervised learning * comes within 8% of a human TA on held-out exams. Ablations show a &gt 2021-07-20 21:02:10 5/ Because this is open-ended Python code, our base architecture is transformers + prototypical networks. But, there are many important details for this to *actually* work: task augmentation, question &amp 2021-07-20 21:02:09 3/ Providing feedback is also hard for ML: not a ton of data, teachers frequently change their assignments, and student solutions are open-ended and long-tailed. Supervised learning doesn’t work. We weren’t sure if this problem can even be solved using ML. 
https://t.co/3ttJVPmNY3 2021-07-20 21:02:08 2/ Student feedback is a fundamental problem in scaling education. Providing good feedback is hard: existing approaches provide canned responses, cryptic error messages, or simply provide the answer. https://t.co/vlysnFYcQj 2021-07-20 21:02:07 Thrilled to share new work on AI for education: can we give detailed, high-quality feedback to students? Post: https://t.co/UQqK4IgNOm NYT Coverage: https://t.co/gzIz8pWsDj A collab w. the amazing @mike_h_wu @chrispiech &amp 2021-07-20 06:48:57 @NafiseSadat Thanks @NafiseSadat! We'll check it out. 2021-07-20 04:26:33 @jacobeisenstein While it appears similar, the details are different: - Only one model remains at test time - Training _more_ than twice didn’t seem to produce models that were as robust as JTT This suggests to me that the mechanisms underlying boosting and JTT are quite different 2021-07-20 03:16:54 Check out the paper for much more analysis on why JTT works &amp Led by Evan Liu, @Behzadhaghgoo, @_anniechen_ 2021-07-20 03:16:53 Does your neural network struggle with spurious correlations? Check out Evan’s long talk at #ICML2021 on why they should just train twice (JTT). Paper: https://t.co/MBrgmvyqLB Talk: https://t.co/Xr3q0oZlR2 Code: https://t.co/HhPqbhXKMh https://t.co/sYrh7SwFNG 2021-07-19 23:07:18 @jaschasd @poolio @PaulVicol @Luke_Metz Congratulations Paul, Luke, and Jascha! 2021-07-18 23:22:58 @ThomasW423 This particular paper didn't use proprietary physics engines in any of the experiments. The environment pictured in the poster was built using MiniWorld. https://t.co/zQQO0quPyk 2021-07-18 23:21:57 @xsteenbrugge We didn't explicitly test this kind of generalization--it would be interesting to see! It should find &amp 2021-07-18 20:51:41 If you are working on meta-RL, you should use DREAM. It’s simple &amp 5-min summary video: https://t.co/mxvRiq8eVQ Paper &amp Presented at #ICML2021 on Weds morning. https://t.co/GIjDh9XGEg https://t.co/qF8n7caNHY 2021-07-13 16:31:04 Interested in how in-the-wild videos of humans can help robots generalize further? Check out @_anniechen_'s presentation at #RSS2021 tmrw and Fri. Her 5 min talk https://t.co/ShgqtpHruJ https://t.co/I8rJe2EuUp 2021-06-25 05:21:58 When training video prediction models on broad datasets, they fail to even fit the training data. FitVid is a simple model that addresses this problem &amp Led by @babaeizadeh, with @msaffar3 @SurajNair_1 @svlevine @doomie https://t.co/I8zz6AxCpv 2021-05-18 00:50:55 @mengyer @nyuniversity @CILVRatNYU @NYUDataScience @zemelgroup @RaquelUrtasun Congratulations Mengye! 2021-05-12 19:25:35 @GlenBerseth @UMontrealDIRO @Mila_Quebec @svlevine @Mvandepanne Congrats Glen! 2021-05-06 19:00:00 Interested in how in-the-wild videos of humans can improve robot generalization? @_anniechen_ is giving an oral presentation at the @iclr_conf Workshop on Self-Supervision for RL. Link: https://t.co/xhMLCeMFg9 Talk: Friday 5/6, 8:30 am PT Q&amp 2021-05-03 17:15:52 Interested in how we can recover invariant and equivariant structures (like convolutions) from data using meta-learning? Come check it out at @iclr_conf Poster/Talk: https://t.co/WLPsLFAF18 Poster Session: Tuesday 5/4, 9-11 am PT https://t.co/7vlVe7FpRx https://t.co/l5IeglCyPm 2021-04-23 04:32:42 Really cool &amp 2021-04-20 03:41:16 Figuring out what form of supervision to give to robots is quite difficult. Actionable models are goal-conditioned Q-functions that allow robots to learn many skills from unlabeled offline data. 
Plus, they are also useful pre-training for downstream tasks! https://t.co/kElyfnrCZz https://t.co/VFzH3mgR2u 2021-04-20 00:54:45 We usually want robots to solve not just one task but many. MT-Opt allow robots to solve multiple tasks from image observations, with a diverse range of objects, using multi-task reinforcement learning. Lots of nuanced design choices went into making this work! https://t.co/ieOakqkUhn 2021-04-01 04:54:40 For more, check out: Paper: https://t.co/afz2PWw0rT Website: https://t.co/geRH3tmgTe Summary video: https://t.co/wg2C1lEsBG I'm quite excited about how reusing broad datasets can help robots generalize, and this project has been a great indication in that direction! (5/5) https://t.co/yVPtIjSUAp 2021-04-01 04:54:39 Does using human videos improve reward generalization compared to using only narrow robot data? We see: * 20% greater task success in new environments * 25% greater task success on new tasks both in simulation and on a real robot. (4/5) https://t.co/S0xfHCmh3F 2021-04-01 04:54:38 This discriminator can be used as a reward by feeding in a human video of the desired task and a video of the robot’s behavior. We use it by planning with a learned visual dynamics model. (3/5) https://t.co/yQBtzwlmNi 2021-04-01 04:54:37 To get reward functions that generalize, we train domain-agnostic video discriminators (DVD) with: * a lot of diverse human data, and * a narrow &amp The idea is super simple: predict if two videos are performing the same task or not. (2/5) https://t.co/4sRhfThzkI 2021-04-01 04:54:36 How can robots generalize to new environments &amp We find that using in-the-wild videos of people can allow learned reward functions to do so! Paper: https://t.co/afz2PWw0rT Led by @_anniechen_, @SurajNair_1 (1/5) https://t.co/5BqpzVgK31 2021-03-31 03:23:31 RT @lynetcha: Excited to announce the launch of the BlackAIR Summer Research Grant Program - providing support for AI Research projects le… 2021-03-16 06:49:29 @tetraduzione @SurajNair_1 @RobobertoMM @drfeifei Bohan anecdotally found that fully deterministic model performs a bit worse, but didn't test it extensively. He did find that maintaining stochastic latent vars at every layer performs much worse than only including a stochastic latent var at the current deepest module. 2021-03-15 17:31:26 Importantly, the improved model performance also translates to better robot task performance! We pre-train GHVAEs on RoboNet &amp By improving the model we see: 50% success -&gt (4/5) https://t.co/TNqj0RK6QK 2021-03-15 17:31:22 The GHVAE model outperforms the impressive SVG’ by @RubenEVillegas et al. and the strong Hier-VRNN model by Castrejon et al. across 4 datasets. (3/5) https://t.co/m7yJTJHoct 2021-03-15 17:31:21 Video prediction models often *underfit* when trained on broad datasets &amp Key idea of GHVAEs: Revive greedy layer-wise training, to train larger models with fewer optimization issues. (2/5) https://t.co/CdgM5lvS1z 2021-03-15 17:31:20 Predicting video is a powerful self-supervised approach for robots to learn about the world Greedy hierarchical VAEs *can train larger models with less memory *surprisingly outperform end-to-end training w Bohan Wu @SurajNair_1 @RobobertoMM, @drfeifei https://t.co/HJPzzz33K4 2021-03-13 07:10:10 @timnitGebru I'm really sorry that you have to deal with all this, Timnit. 2021-03-12 20:34:36 RT @neuro_data: I'm finally watching @chelseabfinn's fantastic lectures on multitask learning and meta-learning, and they are really just s… 2021-03-11 01:42:34 v1.1 of WILDS is out! 
Featuring: * a new dataset on code completion * updates to make datasets faster &amp * reproducible @CodaLabWS worksheets https://t.co/IeXLRZOtgR 2021-03-02 00:26:32 Last week, I gave a talk ‘at’ Toronto discussing 3 principles for tackling distribution shift: * pessimism * adaptation * anticipation and how they can improve robustness to spurious correlations, changes in users, and non-stationary &amp https://t.co/W8UOmHFu6H 2021-02-24 04:27:54 @WellsLucasSanto Congratulations, Wells! That's awesome to hear. 2021-02-17 06:10:14 A new algo for RL from offline data: COMBO pushes down the est. value of states &amp This leads to strong performance &amp https://t.co/aZLgoueL8k w/ @TianheYu @aviral_kumar2 @rmrafailov @aravindr93 @svlevine https://t.co/qZvUXyMUP8 2021-01-04 20:07:36 @sanmikoyejo and I are happy to announce @iclr_conf 2021 workshops! https://t.co/qS83jbkaun Topics range from robustness, to mathematical reasoning, to public health. Workshop organizers will share their workshop websites soon 2020-12-22 22:44:01 Finally, we took an offline dataset from a previous paper by @anniee268, Nam, @SurajNair_1 et al., labeled 200 images with sparse rewards, and trained the robot with LOMPO to close the drawer of a desk *without any further data*. The resulting policy works quite well. (4/4) https://t.co/5riQPc5fzg 2020-12-22 22:43:58 LOMPO learns an ensemble of latent space models &amp This avoids the challenge of modeling uncertainty in raw image space &amp (3/4) https://t.co/hb7v0xVlnh 2020-12-22 22:43:57 Interested in offline RL? Handling image observations &amp We introduce LOMPO to tackle this setting. https://t.co/8xzfNaipRQ with Rafael Rafailov, @TianheYu, @aravindr93 (1/4) https://t.co/ucfAxaGHXX 2020-12-15 21:47:00 When we previously worked on distribution shift &amp WILDS contains a variety of distribution shifts that: - lead to a drop in performance - are relevant to real-world scenarios - have some form of leverage for handling the shift 2020-12-15 21:46:59 Interested in distribution shift &amp Excited to share WILDS: https://t.co/M4PfuKjpO8 We talked to domain experts &amp By an amazing team led by @PangWeiKoh &amp 2020-12-09 20:29:36 I'm at @WiMLworkshop mentoring table #12 for the next hour and a half. Looking forward to chatting about meta-learning, AutoML, or any other questions/topics that you may have! https://t.co/uw4ym1g3hj 2020-12-09 17:35:16 Meta-learning can be used with an unsegmented stream of data, rather than a set of tasks. James is presenting this work at @NeurIPSConf right now. (9-11 am PT). https://t.co/1EvJFWUl9V 2020-12-08 17:16:12 Interested in robustness in reinforcement learning? @SaurabhKRL is presenting this paper at @NeurIPSConf right now (9-11 am PT). Video &amp Paper: https://t.co/Ggz0UH1nC9 https://t.co/FmMaQuJzN0 2020-12-08 00:01:49 Interested in offline RL? At #NeurIPS2020, @TianheYu and Garrett Thomas are presenting: MOPO: Model-Based Offline Policy Optimization Poster session is 9-11 pm PT tonight, at https://t.co/DlUQWHRe29 Video: https://t.co/KToMsslHPn https://t.co/40ztqpsE1T https://t.co/52lqcI83uz 2020-11-18 22:11:47 Congratulations to Annie Xie, @loseydp, Ryan Tolsma, and @DorsaSadigh on the best paper award at @corl_conf! It was a privilege to work with such an awesome group. Annie’s talk is here: https://t.co/pOHAr58dAl https://t.co/dQoEu2LBoo https://t.co/z2VBWogcOb 2020-11-17 19:26:37 Karl is presenting this work @corl_conf in tomorrow’s oral session, at 8:30 am PT. 
Paper: https://t.co/wb8xcgURuZ Site: https://t.co/MSU8fPY7MK Spotlight video: https://t.co/t9scjfTA6S 2020-11-17 19:26:36 In videos of humans, we don’t know the actions or rewards. Plus, there’s substantial domain shift. We address these with simple approaches: generating actions with an inverse model, labeling terminal states with high reward, and using domain invariance methods. (3/5) https://t.co/a6DGEAxO5E 2020-11-17 19:26:35 Getting broad and diverse data for training robots is important, but hard. Videos of people are a broad and largely untapped source of such data. (2/5) https://t.co/LMGIfqzjyh 2020-11-17 19:26:31 Can we allow robots to learn from both human experiences and their own? With a few tricks, we can simply add videos of humans to the replay buffer of a reinforcement learning agent. Paper: https://t.co/wb8xcgURuZ w/ Schmeckpeper, @_oleh, @KostasPenn, @svlevine (1/5) https://t.co/LRhQSETNZm 2020-11-16 22:19:40 LILI allows robots to learn amidst other agents, inferring their strategies &amp By doing so, a robot can learn to block air hockey shots from a non-stationary opponent. Paper: https://t.co/1L99FnrY86 A finalist for best paper at #CoRL2020. https://t.co/ImptCb9YgV https://t.co/NsxuLttmD8 2020-11-15 22:13:06 A method for autonomously discovering different skills, without a human intervening to "reset" the world. In reset-free games: One agent performs the task. The other acts to reset the world to diverse states. To appear at #NeurIPS2020 https://t.co/VYSyFbIRpW 2020-11-05 22:04:18 @DorsaSadigh Congrats Dorsa! 2020-10-31 02:04:00 We refer to this idea as structured maximum entropy RL, since the solutions need to be diverse, but controllable using a latent variable. Check out the paper for the theoretical &amp (3/3) https://t.co/ut7sTo8hqm 2020-10-31 02:03:59 Reinforcement learning can lead to behaviors that don’t generalize. In our #NeurIPS2020 paper, we introduce a simple idea to allow extrapolation to new envs w/ a few trials: One Solution is Not All You Need https://t.co/5t4M4nivVH w. Saurabh Kumar, Aviral Kumar, @svlevine (1/3) https://t.co/qp4Jksa350 2020-10-28 02:01:13 New paper: MELD allows a real robot to adapt to new goals directly from image pixels https://t.co/lyhUgA61JJ The key idea is to _meld_ meta-RL and latent dynamics models for unified state &amp w/ @tonyzzhao, A Nagabandi, K Rakelly, @svlevine to appear at @corl_conf https://t.co/tyvMLZWjIS 2020-10-23 05:00:18 A small amount of guidance can go a long way — BEE: explores relevant objects much more frequently than prior approaches works well on a real robot from images enables better offline RL on downstream tasks (7/8) https://t.co/2wJRvC75dw 2020-10-23 05:00:17 Instead, BEE: a. uses a few min of human guidance of interesting images b. learns to classify whether an image is ‘interesting’ c. explores _autonomously_ to reach states classified as interesting (5/8) https://t.co/6kkPwYGNRh 2020-10-23 05:00:16 Many approaches instead use intrinsic objectives, e.g. this cool paper by @_ramanans, @_oleh, @KostasPenn, @pabbeel, @danijarh, @pathak2206: https://t.co/63Jj1uNucc But, we found these approaches don't scale well to vision-based robotic control, e.g. for opening a drawer. (4/8) https://t.co/bavl1DhmKA 2020-10-23 05:00:12 How should we collect data? We could use a random policy, e.g. as done in RoboNet: https://t.co/EQRfwv27uV This allows for _a lot_ of data, but much of it does not contain useful interactions. 
(3/8) https://t.co/GxV1yo76Zx 2020-10-23 05:00:10 Offline or batch RL is increasingly popular, but where does the batch dataset come from? We introduce the notion of _batch exploration_, where we aim to autonomously collect a large batch of useful data for RL. (2/8) https://t.co/rRTLJKjO3j 2020-10-23 05:00:09 Can robots learn to autonomously explore their environment? We introduce Batch Exploration with Examples (BEE) https://t.co/8zoswdeZzZ led by @anniee268, Alex Nam, &amp Thread (1/8) https://t.co/VtZ09y0MVN 2020-10-16 01:32:36 RT @wellecks: Thanks to @chelseabfinn for the great conversation about her work on meta-learning and robotics -- check it out below! https:… 2020-10-10 01:56:49 RT @red_abebe: We're launching the @black_in_ai Academic Program, which builds on the grad app mentoring program from the past 3 years. T… 2020-09-15 02:45:27 @peteflorence @andyzengtweets @lucas_manuelli @YunzhuLiYZ @rtedrake Nice work Lucas, Yunzhu, Pete, and Russ! The visualization in your video brings back good memories from 2015. (from this paper: https://t.co/Ytctfm5O5k) https://t.co/e6v5y0Ud3C 2020-08-31 19:02:00 Evan wrote an excellent blog post on this work, which is now out on the SAIL blog: https://t.co/6HLuXDPB7u This work changed how I think about exploration in meta-RL &amp 2020-08-26 04:17:28 RT @le_james94: ~ New Post ~ During this quarantine time, I binge-watched @Stanford #CS330 lectures taught by the brilliant @chelseabfinn.… 2020-08-25 20:30:50 RT @KostasPenn: #ECCV2020 How can you predict video from both interaction and observation? Karl Schmeckpeper's paper with @_oleh Annie Xie… 2020-08-20 07:29:35 @AhmedSQRD I find it helpful to seek out &amp Best of luck! 2020-08-17 22:47:32 @red_abebe Congrats, Rediet! 2020-08-07 02:43:26 With this approach, DREAM can learn an exploration strategy that navigates a 3D environment from pixels to go “read” a sign that carries info about the task. (and then execute the task using that info) (4/5) https://t.co/4teggu5vXP 2020-08-07 02:43:25 Want your robot to explore intelligently? We study how to learn to explore &amp Paper: https://t.co/DNRJzlo8rw w Evan Liu, Raghunathan, Liang @StanfordAILab Thread(1/5) https://t.co/qcR6G1wfBk 2020-07-20 05:28:53 Congrats to Annie Xie and @jmes_harrison for receiving a best paper recognition at the #ICML2020 Workshop on Lifelong ML. (and to Jorge and Eric at @GRASPlab!) https://t.co/IWSYF3l4YN https://t.co/ad1IpoCDO4 2020-07-17 01:21:55 @jiajunwu_cs Congratulations, Jiajun! 2020-07-16 02:32:13 To learn more, check out our ICML poster session tomorrow @icmlconf at 7 am PT and 6 pm PT! Poster: https://t.co/xsCoEjo2fC Website &amp 2020-07-16 02:32:12 Some prior works, e.g. value-aware model learning (https://t.co/MNl7w4dMWL), approach this problem by using the reward function. However, rewards are hard to get in the real world. Can we get both task-oriented predictions *and* a scalable self-supervised approach? (2/5) 2020-07-16 02:32:11 In model-based RL, learning a global model of *everything* is really hard. Can we learn to model only what matters? We introduce: Goal-Aware Prediction (GAP) https://t.co/88WLo4crfU with @SurajNair_1 @silviocinguetta @StanfordAILab Thread (1/5) https://t.co/qpm0fQA5hd 2020-07-08 06:21:38 @LouisKirschAI @allan_zhou1 @TensorProduct @StanfordAILab @SchmidhuberAI @hardmaru Thanks Louis! We discuss hypernets and relevant evolutionary algorithms in the paper. Fernando et al looks relevant too! 
Regarding empirical comparisons, our goal wasn’t to push state-of-the-art but to validate specific hypotheses surrounding acquiring equivariance structure 2020-07-08 04:09:46 Finally, we also aim to learn symmetries from augmented image data, to essentially _bake data augmentation into the architecture_. Here, we find that MSR also performs well. (7/8) https://t.co/0puOwCYKtB 2020-07-08 04:09:45 We then ask, can we discover something *better* that convolutions when there is less or more symmetry in the data? With the same synthetic data set-up, we find that the answer is indeed yes, and that MSR can be applied on top of convolutions to learn stronger symmetries. (6/8) https://t.co/PaF64wDySm 2020-07-08 04:09:44 To think about this question, we first look at how equivariances are represented in neural nets. They can be seen as certain weight-sharing &amp (2/8) https://t.co/tTR95yAWtA 2020-07-08 04:09:43 Convolution is an example of structure we build into neural nets. Can we _discover_ convolutions &amp Excited to introduce: Meta-Learning Symmetries by Reparameterization https://t.co/Si612XfcFk w/ @allan_zhou1 @TensorProduct @StanfordAILab Thread 2020-07-07 02:34:53 Key idea: Rather than trying to be robust, we train a meta-learning algorithm to *adapt* to group distribution shift with unlabeled data. (5/6) https://t.co/JIWaAvVLKE 2020-07-07 02:34:52 Supervised ML methods (i.e. ERM) assume that train &amp To help, we introduce adaptive risk minimization (ARM): https://t.co/y3l2KCmmiB With M Zhang, H Marklund @abhishekunique7 @svlevine (1/6) 2020-07-06 22:29:02 RT @AndrewYNg: New @ICEgov policy regarding F-1 visa international students is horrible &amp 2020-06-28 03:57:34 @mgoulao1 @jmes_harrison @StanfordAILab Thanks for pointing this out! We'll try to improve this in the next version (and future papers) with different linestyles for different methods, in addition to different colors. 2020-06-27 03:01:11 By modeling and anticipating change, LILAC can do much better! We expect LILAC to work well in environments where non-stationarity can be modeled to some degree. (4/5) https://t.co/VRDlw5HeqW 2020-06-27 03:01:10 How can robots learn in changing, open-world environments? We study: Deep Reinforcement Learning amidst Lifelong Non-Stationarity https://t.co/ZgRVAgAjTU with Annie Xie, @jmes_harrison @StanfordAILab (1/5) https://t.co/jxvNu7seAt 2020-06-17 19:25:54 @paypaytr @FerranAlet @MIT The link for the talk is here: https://t.co/m7QadnDYat 2020-06-17 15:51:49 RT @FerranAlet: Today (Wednesday) at 2pm Eastern we will have our first Youtube livestream of the @MIT Embodied Intelligence Seminar! @chel… 2020-06-12 19:07:13 RT @beenwrekt: Excited to end on a high note: @chelseabfinn will give our final talk of #L4DC2020, starting at 12:15 PDT. 2020-06-08 22:41:36 RT @vj_chidambaram: The number of African Americans granted PhDs in CS in 2019 in the entire country (USA) was.. 13 10 male, 3 female. I… 2020-06-04 02:59:23 @jasminewsun @StanfordEng I'm really sorry to hear about this, Jasmine. I'm a CS prof at Stanford (teaching CS221 right now). I forwarded your thread to Michael Bernstein, who volunteered himself to help students work with faculty on making appropriate accommodations. 2020-05-28 19:25:42 @yablak @StefanoErmon @james_y_zou @svlevine @tengyuma With the exception of tabular problems/grid worlds, I believe the DICE family of algorithms has only been shown to work in online off-policy settings, rather than the fully offline setting. 
2020-05-28 01:16:57 Tagging authors I missed: @TianheYu @yulantao1996 2020-05-28 01:09:41 The most surprising part of this project was that: existing model-based RL methods perform substantially better than model-free approaches in the offline setting. We have hypotheses but don't know exactly why this is, suggesting interesting questions for future work https://t.co/QRqtB8eO5C 2020-05-28 01:09:40 Offline RL may make it possible to learn behavior from large, diverse datasets (like the rest of ML). We introduce: MOPO: Model-based Offline Policy Optimization https://t.co/6MlXg4Y5qc w/ Tianhe Yu, Garrett Thomas, Lantao Yu @StefanoErmon @james_y_zou @svlevine @tengyuma 2020-05-12 05:07:06 @neoneo83775198 If you want to adapt to only K samples at test time, then the inner gradient of MAML should be computed using at most K samples. For variable K, you can compute the gradient averaged across the samples. 2020-05-06 15:44:54 RT @DeepMind: Keen to understand multi-task and meta-learning methods? Feryal suggests watching the new lecture series by @chelseabfinn, as… 2020-05-05 03:27:37 @RichardYRLi Tianhe tried an approach analogous to A-GEM in the multi-task setting, and found that it performed poorly. We didn't investigate much further though. 2020-05-04 17:07:56 Working on multi-task learning? Trying out PCGrad now requires only 1 more line of code. https://t.co/ti53rhNHZD PyTorch version coming soon. https://t.co/J1wPb9fo5d https://t.co/deekIVecvr 2020-05-02 01:24:43 Interested in unsupervised meta-RL, where agents propose their own tasks and learn how to learn them? Check out a new blog post by @abhishekunique7 and Ben Eysenbach: https://t.co/VGFLLfVdVr https://t.co/ZuJWSHW26c 2020-04-30 04:18:03 @marcgbellemare On the topic of pet peeves, I dislike phrases like "Through our experiments, we demonstrate..." The experiments should be set up to test hypotheses Authors shouldn't be set out to "demonstrate" those conclusions. 2020-04-29 02:23:20 Check out @allan_zhou1’s work on meta-learning from demos &amp 10 am &amp Poster &amp Website: https://t.co/cQKn3bxXrl https://t.co/iEE7qQTEsP 2020-04-29 02:18:36 Check out @SurajNair_1’s work on hierarchical foresight @iclr_conf! HVF tackles long-horizon vision-based tasks without any human supervision. 10 am &amp Poster &amp Website: https://t.co/4O1cAijTPH https://t.co/F02HBJIL3l 2020-04-28 17:03:07 RT @StanfordAILab: ICLR 2020 is being hosted virtually right now -- we’re excited to share all the work from SAIL that’s being presented wi… 2020-04-26 07:05:14 The @iclr_conf BeTR RL workshop starts soon! Tune in for: - many interesting contributed posters &amp - talks from Martha White, Abhishek Gupta, Ishita Dasgupta, @jeffclune - panel with the above speakers + @SchmidhuberAI https://t.co/GmNPWkxkJa 2020-04-22 17:21:16 Worried about sample efficiency of RL? Fine-tuning works for RL too. Similar to the success of ImageNet/LM fine-tuning, robots can adapt to *drastic* changes in robot morphology, lighting, and objects, when pre-trained with RL. (+ it doesn't work with ImageNet pre-training) https://t.co/3oLCLQ44lk https://t.co/TqyfewpaMv 2020-04-17 21:22:23 An excellent post-doc opportunity for AI researchers, especially for those affected by hiring freezes https://t.co/HIlIY5CRlD 2020-04-16 18:30:06 Honored to be selected for the MSR Faculty Fellowship award, which reflects not only on my work but also my students, collaborators, and advisers. Thank you Microsoft, and congratulations to my fellow fellows! 
https://t.co/vMscchY4T7 2020-04-10 17:54:20 RT @BeEngelhardt: JFC. Article on the vision of AI for the future. Eight men mentioned. Seven men quoted. Zero women. Zero people of color.… 2020-04-07 19:33:30 (3/5) As the env gets more complex, goal-conditioned RL struggles, while weakly-supervised control (WSC) continues to perform well. https://t.co/VhygEWgVIT 2020-04-07 19:33:29 Supervising RL is hard, especially if you want to learn many tasks. To address this, we present: Weakly-Supervised Reinforcement Learning for Controllable Behavior https://t.co/UKF7vSM5Lw with @rl_agent, Eysenbach, @rsalakhu, @shaneguML @GoogleAI thread https://t.co/Te0vfnWiiR 2020-04-01 01:51:56 I talked to Synced about my experiences as a women in the AI community + my outreach efforts to make AI accessible &amp Despite the headline, I’m optimistic about progress we've made as a community, and hope we can continue that progress! https://t.co/RgDnaC6r8G 2020-03-18 05:39:34 Learning to grasp &amp https://t.co/cPlSimGndE w/ Akhil Padmanabha, @febert8888, Stephen Tian, @RCalandra, @svlevine https://t.co/81HV71IXG1 https://t.co/3McApn1hjk 2020-03-17 03:34:11 Meta-learning applied to drug discovery, in new research from @GSK: https://t.co/L9lN9IQkPJ Gated graph neural nets + MAML predict the properties of new molecules from their chemical graph with a small amount of data. https://t.co/hQrqmVCvE8 2020-03-08 20:26:59 RT @crude2refined: Watching the meta learning course by @chelseabfinn &amp 2020-03-06 05:50:34 Imitation learning needs human demos to improve. In MILI, robots improve themselves w/ multi-task self-imitation Scalable Multi-Task Imitation Learning with Autonomous Improvement https://t.co/INcUXMRPTi w @avisingh599 @ericjang11 @AlexIrpan Kappler @mihdalal @svlevine Khansari https://t.co/VOFGDLBcy4 2020-03-05 20:02:40 Robots can use meta-learning to quickly adapt from simulation to various real world conditions. Rapidly Adaptable Legged Robots via Evolutionary Meta-Learning https://t.co/Xf02zxgUlv with Xingyou Song, Yuxiang Yang, Choromanski, Caluwaerts, Gao, Jie Tan @GoogleAI https://t.co/3x4zPpMzlo 2020-02-27 16:08:59 @jeffclune Thank you for stopping by to give a great guest lecture! 2020-02-25 19:28:51 Want to learn about meta-learning? Lecture videos for CS330 are now online! https://t.co/taJ5yyIWVQ Topics incl. MTL, few-shot learning, Bayesian meta-learning, lifelong learning, meta-RL &amp https://t.co/mJ1v71huD7 + 3 guest lectures from Kate Rakelly, @svlevine, @jeffclune https://t.co/X609iUvCXp 2020-02-18 23:11:34 Excited to be hosting @OriolVinyalsML for the afternoon! @StanfordAILab https://t.co/Jk7bmdH52c https://t.co/ekvHYmtA4X 2020-02-02 20:45:15 RT @svlevine: @OriolVinyalsML To quote Hamming: "The great scientists often make this error. They fail to continue to plant the little acor… 2020-01-22 02:41:07 Excited to share PCGrad, a super simple &amp On Meta-World MT50, PCGrad can solve *2x* more tasks than prior methods https://t.co/VF8Ldxu3A5 w/ Tianhe Yu, S Kumar, Gupta, @svlevine, @hausman_k https://t.co/uTeUhULUTA 2020-01-13 18:20:18 We're organizing a workshop on 'Beyond "Tabula Rasa" in RL' (BeTR RL) at @iclr_conf &amp Deadline: Feb 10 Invited speakers include @NandoDF, Ishita Dasgupta, Abhishek Gupta, Martha White, with @SchmidhuberAI as a panelist. https://t.co/GmNPWkxkJa 2020-01-03 22:22:36 @ChombaBupe @GRASPlab @_oleh @KostasPenn @svlevine In principle, the method can model varying backgrounds &amp Though handling large domain shift in either remains quite challenging. 
2020-01-02 03:37:51 Can robots learn about the world by observing humans? Learn to predict with both interaction &amp https://t.co/pP6SOLTlCQ https://t.co/M9U7Fafvqx w. Schmeckpeper @GRASPlab, Xie, @_oleh, Tian, @KostasPenn, @svlevine https://t.co/2EeW02Dmy8 2019-12-20 05:53:45 Can we discover structure &amp Continuous Meta-Learning without Tasks https://t.co/C4KBfUZmHY w @jmes_harrison, Sharma, @MarcoPavoneSU https://t.co/fDMCd9MumV 2019-12-14 06:13:10 @Haoxiang__Wang @NeurIPSConf Slides: https://t.co/MTzOQQ2uFX Video: https://t.co/MhCJ8VbIHv 2019-12-13 17:46:32 I'll be discussing this work and other challenges in meta-learning at the @NeurIPSConf Bayesian Deep Learning Workshop at 1:20 pm, West Exhibition Hall C. https://t.co/LhMgnizzti https://t.co/VlNHLaGimu 2019-12-11 07:57:01 @abdjml11 @Mingzhangyin @georgejtucker @svlevine We didn't try inequality-based TAML, but I don't think it would help b/c we observe that the pre-update model works well on all training tasks We generally found regularization on activations &amp 2019-12-11 07:45:24 Interested RL algorithms that learn to follow instructions, use hierarchical abstractions, and achieve compositional generalization? We investigate the many benefits of language in RL. We'll be at @NeurIPSConf poster #197, at 10:45 am! w @yidingjiang @shaneguML Murphy @GoogleAI https://t.co/LIVH34EIo3 https://t.co/JDHnDBjk07 2019-12-11 07:33:28 Meta-RL relies heavily on manually-defined task distributions. CARML constructs curricula of tasks *without supervision* in the loop of meta-learning. https://t.co/vJ4tdbvVqK On Weds, Allan Jabri will present a spotlight @NeurIPSConf at 10:30 am in Hall A, poster #53 at 5:30 pm! https://t.co/pvdD8t55yg 2019-12-10 17:39:20 RT @svlevine: Can we distribute meta-RL, with local policy learners distilled into a centralized meta-policy? Find out about guided meta-po… 2019-12-10 07:24:26 We're presenting our work on meta-learning with implicit differentiation @NeurIPSConf Come find us at the Tuesday evening poster session #47, tomorrow 5:30-7:30 pm. https://t.co/el0LPJawVp https://t.co/T4FQbKSk2M 2019-12-10 06:18:25 Meta-learning has a peculiar, widespread problem that leads to terrible performance when faced with seemingly benign changes to the training set-up. We analyze this problem &amp w/ @Mingzhangyin, @georgejtucker, Zhou, @svlevine https://t.co/yIIy9fEzqv 2019-12-05 02:16:04 RT @chrmanning: .@SuryaGanguli &amp 2019-11-26 07:16:36 Sudeep Dasari wrote an excellent blog post on our work on RoboNet, cross-posted on the SAIL blog: https://t.co/alVOrRWXkh We imagine a future where robots share data *across* research labs, just like the rest of machine learning. https://t.co/6fNDCLM1DJ https://t.co/c77Sxl0Vqt 2019-11-22 04:22:02 RT @EmmaBrunskill: Delighted to share our Science article on making it easier ensure AI systems satisfy societal values. https://t.co/qvDgM… 2019-11-12 07:39:58 I'm on my way to @Khipu_AI in Montevideo, Uruguay. Looking forward to giving a talk on ML for robots, joining the Women in AI panel, and meeting talented researchers! https://t.co/qhsbbEgc1r https://t.co/RZDMMNThTQ 2019-10-25 03:46:29 Tired of your robot learning from scratch? We introduce RoboNet: a dataset that enables fine-tuning to new views, new envs, &amp https://t.co/yU7QQF1gnq https://t.co/5yHf53BnyV w/ Dasari @febert8888 Tian @SurajNair_1 Bucher Schmeckpeper Singh @svlevine https://t.co/BC82ZBx8YX 2019-10-25 00:43:04 The Meta-World paper is now out! 
2019-10-17 07:18:16 @jacobmbuckman @ilyasut @gdb @lilianweng @ludwigschubert Meta-RL problems are [usually] a special case of a POMDP where there is a single latent variable that is not time-varying nor affected by actions. @svlevine and I discussed this connection in our ICML tutorial: https://t.co/i2PkrXb53p I don't know what "emergent meta-learning" is https://t.co/nQkE1sURX0

2019-09-26 04:15:05 Excited to share that I'm teaching a *new course* on multi-task & , etc. Slides & https://t.co/mJ1v71huD7 https://t.co/Gmve2kUvHW

2019-09-24 04:27:51 @OriolVinyalsML @stanfordnlp @maithra_raghu I've also observed this to be the case with few-shot image classification! I think this might be telling us more about the benchmarks than the algorithms. I look forward to seeing how these methods perform as we start to evaluate them on much harder problems.

2019-09-19 20:09:03 @TristanDeleu @fhuszar We discuss some of the relationships between iMAML and other algorithms like Reptile and first-order MAML in Appendix A of the paper. For example, you can recover first-order MAML from iMAML with zero CG steps: https://t.co/dds0FRgyKV

2019-09-19 17:58:55 An accessible blog post and nice visualizations by @fhuszar on our NeurIPS paper on meta-learning with implicit gradients! https://t.co/byGBlZfIRR

2019-09-15 20:02:09 New paper: Hierarchical visual foresight learns to generate visual subgoals to break down a goal into smaller pieces. Accomplishes long-horizon vision-based tasks, without *any* supervision. w/ Suraj Nair @GoogleAI Paper: https://t.co/ngKjPAJDeK Code: https://t.co/0qoTc4oWGb https://t.co/y7U5BfKvAz

2019-09-12 17:01:24 @ravitej_17 @NeurIPSConf @aravindr93 @ShamKakade6 @svlevine Yes, the proximal term is important for imposing the learned prior on the optimization. The weight of the proximal term can be adjusted to control the weight of the prior, either as a hyperparameter or, in principle, as a learned [meta] parameter.

2019-09-12 01:04:31 @danijarh @hausman_k @svlevine Clarification: We include an API for getting image observations for all of the environments, but have only tested RL without images so far.

2019-09-11 18:52:10 Evaluating multi-task & Introducing Meta-World https://t.co/F5Huc0SYBn Benchmark including *50* simulated manipulation tasks Paper & w/ T Yu, D Quillen, Z He, R Julian, @hausman_k @svlevine https://t.co/TLHKma28s2

2019-09-11 01:32:32 It's hard to scale meta-learning to long inner optimizations. We introduce iMAML, which meta-learns *without* differentiating through the inner optimization path, using implicit differentiation. https://t.co/hOSi3Irymh to appear @NeurIPSConf w/ @aravindr93 @ShamKakade6 @svlevine https://t.co/fBznTaubgr

2019-09-03 06:52:02 @yukez @UTAustin @UTCompSci Congrats @yukez and @UTAustin!

2019-08-27 15:50:10 @ImranSHaque @RecursionPharma Congrats Imran!

2019-08-13 00:41:55 RT @pabbeel: The website for the #NeurIPS2019 Deep RL Workshop is now live! Paper submissions due September 9th. Co-organized with @jach…

2019-07-31 00:32:43 RT @ZhitingHu: We're organizing #NeurIPS2019 workshop on Learning with Rich Experience: Integration of Learning Paradigms, w/ an amazing li…

2019-07-20 01:33:12 RT @craigss: Listen to @chelseabfinn talk about building robots that will learn forever - or at least as long as they are turned on.. #Art…
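
The iMAML tweet (2019-09-11) and the Appendix A reply above (2019-09-19) are compressed, so here is a toy NumPy sketch of the implicit meta-gradient. It is a reconstruction under simplifying assumptions, not the authors' code: the inner-loss Hessian is passed in explicitly (a real implementation would only use Hessian-vector products), and `lam` stands for the weight of the proximal term mentioned in the 2019-09-12 reply.

```python
# Hedged sketch of iMAML's implicit meta-gradient: solve
# (I + H/lam) x = grad_outer with conjugate gradient, so nothing
# about the inner optimization path is ever differentiated through.
import numpy as np

def imaml_meta_grad(hess_inner, grad_outer, lam, cg_steps=20):
    """hess_inner: Hessian of the inner loss at the adapted params.
    grad_outer: gradient of the outer (test) loss at the same point.
    With cg_steps=0 this returns grad_outer unchanged, i.e.
    first-order MAML, matching the zero-CG-steps remark above."""
    A = np.eye(len(grad_outer)) + hess_inner / lam
    x = grad_outer.astype(float)  # CG initialized at grad_outer
    r = grad_outer - A @ x
    p = r.copy()
    for _ in range(cg_steps):
        if r @ r < 1e-12:
            break
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x  # ≈ (I + H/lam)^(-1) grad_outer

H = np.diag([4.0, 1.0])   # toy inner-loss Hessian at the adapted params
g = np.array([1.0, 1.0])  # toy outer-loss gradient at the same point
print(imaml_meta_grad(H, g, lam=2.0))  # ≈ [1/3, 2/3]
```

Because only the adapted solution (and products with the Hessian there) enter the linear solve, memory and compute stay flat no matter how long the inner optimization ran, which is what the tweet means by scaling meta-learning to long inner optimizations.
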
2019-07-03 22:18:25 How should robots learn from humans? While most use rewards or demos, we study how we might have agents learn tasks via interaction with humans. Try for yourself here! https://t.co/27t5QvSamp w/ Mark Woodward, @hausman_k Arxiv version: https://t.co/o8kjAGm8c7

2019-06-27 07:10:49 RT @OriolVinyalsML: Reminder that applications to @Khipu_AI close this Friday. EVERYONE can / should apply! (speakers @chelseabfinn @kchony…

2019-06-19 04:12:42 Can the compositionality of language help RL agents solve long-horizon tasks? We develop an *open-source* CLEVR-like RL env, evaluate long-horizon reasoning + systematic generalization in RL Language as an Abstraction for HRL https://t.co/de5O9ILgcS w @yidingjiang S Gu, K Murphy https://t.co/g3qv3OwvV5

Discover the AI Experts

Nando de Freitas Researcher at DeepMind
Nige Willson Speaker
Ria Pratyusha Kalluri Researcher, MIT
Ifeoma Ozoma Director, Earthseed
Will Knight Journalist, Wired