In Lecture 2 we foreshadowed the need for a different style of semantics that could handle non-terminating programs. In Lecture 3 we started building some infrastructure that could deal with ...
If both choose the same response, it is a draw (there is no winner). The user's inputs are specified by entering a number: 1 for rock, 2 for paper, 3 for scissors. If the user enters a 4, the program ...
Robustly cooperating with unseen agents and human partners presents significant challenges due to the diverse cooperative conventions these partners may adopt. Existing Ad Hoc Teamwork (AHT) methods ...
Relaxed Exploration Constrained Reinforcement Learning. Shahaf S. Shperberg, Bo Liu, and Peter Stone. @InProceedings{shahaf_shperberg_AAMAS_2024, author = {Shahaf S. Shperberg and Bo Liu and Peter ...
Current approaches to learning cooperative multi-agent behaviors assume relatively restrictive settings. In standard fully cooperative multi-agent reinforcement learning, the learning algorithm controls ...
One vision of a future artificial intelligence (AI) is where many separate units can learn independently over a lifetime and share their knowledge with each other. The synergy between lifelong learning ...
On Network Appliance filers, there is a feature called snapshots. This feature is enabled at UTCS for files stored on filer4b. Home directories and most research group space live here. As the name ...
A critical bottleneck limiting imitation learning in robotics is the lack of data. This problem is more severe in mobile manipulation, where collecting demonstrations is harder than in stationary ...
MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection. Jiaxun Cui, Xiaomeng Yang, Mulong Luo, Geunbae Lee, Peter Stone, Hsien-Hsin S. Lee, Benjamin Lee, G. Edward ...
Experience replay (ER) is a crucial component of many deep reinforcement learning (RL) systems. However, uniform sampling from an ER buffer can lead to slow convergence and unstable asymptotic ...
There will be two modalities for this course. We can meet in person or online. We will give you at least one week's notice when we go from virtual to in-person or vice versa. When we meet online the ...
I am a 4th year computer science PhD student at UT Austin, advised by Prof. Swarat Chaudhuri. My research focuses on building machine learning frameworks to generate code with human-like efficiency. I ...