TruthArchive.ai - Tweets Saved By @DynamicWebPaige

Saved - September 18, 2023 at 6:59 AM
reSee.it AI Summary
DeepMind's Code AI team presented their work on reinventing software development and creative technical work with RL and generative models, and highlighted the field's potential for creators. The linked research and applied work spans large sequence models for software development, HTML understanding, natural-language-to-code generation, and ML-enhanced code completion. Smaller models powered by PaLM 2 (soon to be Gemini) are already available for developers to try, and more product integrations are expected in the coming months.

@DynamicWebPaige - 👩‍💻 Paige Bailey

✨👩‍💻 Our @DeepMind Code AI team delivered a presentation this morning about the work we've done internally and externally—and the path for reinventing what it means to do software development and creative technical work in the age of generative models.

🤖 RL and generative models combined have massive potential for creators: and there has never been a more capable group to build and implement this vision. ✨ Very excited for the years to come!

🙌🏻 If you're curious about what we've been working on, a small fraction of our research and applied work can be found in the links below:

• Large sequence models for software development activities https://ai.googleblog.com/2023/05/large-sequence-models-for-software.html
• Understanding HTML with Large Language Models https://arxiv.org/abs/2210.03945
• Natural Language to Code Generation in Interactive Data Science Notebooks https://arxiv.org/abs/2212.09248
• ML-Enhanced Code Completion Improves Developer Productivity https://ai.googleblog.com/2022/07/ml-enhanced-code-completion-improves.html
• Code as Policies: Language Model Programs for Embodied Control https://code-as-policies.github.io
• Learning Performance-Improving Code Edits https://arxiv.org/abs/2302.07867
• Generative Agents: Interactive Simulacra of Human Behavior https://arxiv.org/abs/2304.03442
• AlphaDev discovers faster sorting algorithms https://deepmind.com/blog/alphadev-discovers-faster-sorting-algorithms
• Competitive programming with AlphaCode https://deepmind.com/blog/competitive-programming-with-alphacode
• Baldur: Whole-Proof Generation and Repair with Large Language Models https://arxiv.org/abs/2303.04910

...and more.

Large sequence models for software development activities blog.research.google
Understanding HTML with Large Language Models Large language models (LLMs) have shown exceptional performance on a variety of natural language tasks. Yet, their capabilities for HTML understanding -- i.e., parsing the raw HTML of a webpage, with applications to automation of web-based tasks, crawling, and browser-assisted retrieval -- have not been fully explored. We contribute HTML understanding models (fine-tuned LLMs) and an in-depth analysis of their capabilities under three tasks: (i) Semantic Classification of HTML elements, (ii) Description Generation for HTML inputs, and (iii) Autonomous Web Navigation of HTML pages. While previous work has developed dedicated architectures and training procedures for HTML understanding, we show that LLMs pretrained on standard natural language corpora transfer remarkably well to HTML understanding tasks. For instance, fine-tuned LLMs are 12% more accurate at semantic classification compared to models trained exclusively on the task dataset. Moreover, when fine-tuned on data from the MiniWoB benchmark, LLMs successfully complete 50% more tasks using 192x less data compared to the previous best supervised model. Out of the LLMs we evaluate, we show evidence that T5-based models are ideal due to their bidirectional encoder-decoder architecture. To promote further research on LLMs for HTML understanding, we create and open-source a large-scale HTML dataset distilled and auto-labeled from CommonCrawl. arxiv.org
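The semantic-classification task described in that abstract can be made concrete with a toy prompt builder. A minimal sketch, assuming an illustrative label set and prompt format (not the paper's actual ones):

```python
# Toy prompt builder for HTML semantic classification: given raw HTML with
# one marked element, ask an LLM which role it plays. The labels and the
# prompt layout are illustrative assumptions, not the paper's.

def classification_prompt(html_snippet, labels):
    """Format an HTML snippet and candidate roles into a single prompt string."""
    options = ", ".join(labels)
    return (
        f"HTML: {html_snippet}\n"
        f'Question: what role does the element with id="target" play? ({options})\n'
        f"Answer:"
    )

prompt = classification_prompt(
    '<input id="target" type="text" name="q" placeholder="Search...">',
    labels=["search box", "submit button", "username field"],
)
```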
Natural Language to Code Generation in Interactive Data Science Notebooks Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks. ARCADE features multiple rounds of NL-to-code problems from the same notebook. It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. Finally, we explore few-shot prompting strategies to elicit better code with step-by-step decomposition and NL explanation, showing the potential to improve the diversity and explainability of model predictions. arxiv.org
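The multi-turn setting ARCADE evaluates, where prior notebook cells and earlier NL-to-code turns form the context for the next intent, can be sketched as a simple prompt assembler. The format below is hypothetical, not the paper's actual context encoding:

```python
# Minimal sketch of assembling a few-shot, multi-turn NL-to-code prompt for a
# notebook: existing cells plus previous intent/code turns become context for
# the new natural-language intent. Purely illustrative.

def build_notebook_prompt(context_cells, turns, new_intent):
    """Concatenate notebook context and prior NL->code turns into one prompt."""
    parts = ["# Notebook context:"]
    parts.extend(context_cells)
    for intent, code in turns:
        parts.append(f"# Intent: {intent}")
        parts.append(code)
    parts.append(f"# Intent: {new_intent}")
    parts.append("# Code:")
    return "\n".join(parts)

notebook_prompt = build_notebook_prompt(
    context_cells=["import pandas as pd", "df = pd.read_csv('data.csv')"],
    turns=[("show the first rows", "df.head()")],
    new_intent="count missing values per column",
)
```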
ML-Enhanced Code Completion Improves Developer Productivity blog.research.google
Code as Policies: Language Model Programs for Embodied Control code-as-policies.github.io
Learning Performance-Improving Code Edits The waning of Moore's Law has shifted the focus of the tech industry towards alternative methods for continued performance gains. While optimizing compilers are a standard tool to help increase program efficiency, programmers continue to shoulder much responsibility in crafting and refactoring code with better performance characteristics. In this paper, we investigate the ability of large language models (LLMs) to suggest functionally correct, performance improving code edits. We hypothesize that language models can suggest such edits in ways that would be impractical for static analysis alone. We investigate these questions by curating a large-scale dataset of Performance-Improving Edits, PIE. PIE contains trajectories of programs, where a programmer begins with an initial, slower version and iteratively makes changes to improve the program's performance. We use PIE to evaluate and improve the capacity of large language models. Specifically, we use examples from PIE to fine-tune multiple variants of CODEGEN, a billion-scale Transformer-decoder model. Additionally, we use examples from PIE to prompt OpenAI's CODEX using few-shot prompting. By leveraging PIE, we find that both CODEX and CODEGEN can generate performance-improving edits, with speedups of more than 2.5x for over 25% of the programs, for C++ and Python, even after the C++ programs were compiled using the O3 optimization level. Crucially, we show that PIE allows CODEGEN, an open-sourced and 10x smaller model than CODEX, to match the performance of CODEX on this challenging task. Overall, this work opens new doors for creating systems and methods that can help programmers write efficient code. arxiv.org
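To make "performance-improving edit" concrete, here is a hypothetical before/after pair of the kind PIE collects: the same function rewritten from a quadratic pairwise scan to a linear hash-map pass. The example program is illustrative, not drawn from the dataset:

```python
# A hypothetical PIE-style edit: a slower initial program and a functionally
# equivalent, faster revision of the same task (count pairs summing to target).
from collections import Counter

def count_pairs_slow(nums, target):
    """O(n^2): check every pair for the target sum."""
    count = 0
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                count += 1
    return count

def count_pairs_fast(nums, target):
    """O(n): for each element, count previously seen complements."""
    seen = Counter()
    count = 0
    for x in nums:
        count += seen[target - x]  # Counter returns 0 for unseen keys
        seen[x] += 1
    return count
```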
Generative Agents: Interactive Simulacra of Human Behavior Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty-five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior. arxiv.org
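The memory-retrieval step that architecture relies on (scoring stored memories by recency, importance, and relevance before planning) can be sketched minimally. The weights and decay factor below are illustrative assumptions, not the paper's values:

```python
# Minimal sketch of generative-agent memory retrieval: score each memory by a
# sum of recency (exponential decay), importance, and query relevance, then
# return the top-k. Equal weighting and decay=0.995 are assumptions.
def retrieve(memories, now, query_relevance, k=3, decay=0.995):
    """memories: dicts with 'text', 'time', and 'importance' in [0, 1]."""
    def score(m):
        recency = decay ** (now - m["time"])          # fades as time passes
        relevance = query_relevance(m["text"])         # e.g. embedding similarity
        return recency + m["importance"] + relevance   # equal weights for the sketch
    return sorted(memories, key=score, reverse=True)[:k]
```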
AlphaDev discovers faster sorting algorithms In our paper published today in Nature, we introduce AlphaDev, an artificial intelligence (AI) system that uses reinforcement learning to discover enhanced computer science algorithms – surpassing those honed by scientists and engineers over decades. deepmind.com
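AlphaDev's discovered routines are tiny fixed-length sorts (e.g. sort3, sort5) inside the C++ standard library, expressed as short branch-light sequences of compare-and-swap instructions. To show the shape of such a routine, here is a standard 3-element sorting network sketched in Python; it is not AlphaDev's output:

```python
# A 3-element sorting network: a fixed sequence of compare-exchanges that
# sorts any input ordering. Routines of this shape (in assembly, using
# min/max moves) are what AlphaDev optimized.
def sort3(a, b, c):
    """Sort three values with exactly three compare-exchange steps."""
    if b < a:
        a, b = b, a  # compare-exchange (a, b)
    if c < b:
        b, c = c, b  # compare-exchange (b, c)
    if b < a:
        a, b = b, a  # compare-exchange (a, b)
    return a, b, c
```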
Competitive programming with AlphaCode Solving novel problems and setting a new milestone in competitive programming. deepmind.com
Baldur: Whole-Proof Generation and Repair with Large Language Models Formally verifying software properties is a highly desirable but labor-intensive task. Recent work has developed methods to automate formal verification using proof assistants, such as Coq and Isabelle/HOL, e.g., by training a model to predict one proof step at a time, and using that model to search through the space of possible proofs. This paper introduces a new method to automate formal verification: We use large language models, trained on natural language text and code and fine-tuned on proofs, to generate whole proofs for theorems at once, rather than one step at a time. We combine this proof generation model with a fine-tuned repair model to repair generated proofs, further increasing proving power. As its main contributions, this paper demonstrates for the first time that: (1) Whole-proof generation using transformers is possible and is as effective as search-based techniques without requiring costly search. (2) Giving the learned model additional context, such as a prior failed proof attempt and the ensuing error message, results in proof repair and further improves automated proof generation. (3) We establish a new state of the art for fully automated proof synthesis. We reify our method in a prototype, Baldur, and evaluate it on a benchmark of 6,336 Isabelle/HOL theorems and their proofs. In addition to empirically showing the effectiveness of whole-proof generation, repair, and added context, we show that Baldur improves on the state-of-the-art tool, Thor, by automatically generating proofs for an additional 8.7% of the theorems. Together, Baldur and Thor can prove 65.7% of the theorems fully automatically. This paper paves the way for new research into using large language models for automating formal verification. arxiv.org
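The generate-then-repair loop that abstract describes can be sketched in a few lines. Here `generate`, `repair`, and `check` are hypothetical stand-ins for Baldur's fine-tuned models and the Isabelle/HOL proof checker:

```python
# Sketch of whole-proof generation with one repair round, as described in the
# Baldur abstract: generate a complete proof in one shot (no search), check it,
# and on failure hand the repair model the failed attempt plus the error.
def prove(theorem, generate, repair, check, max_repairs=1):
    proof = generate(theorem)                    # whole proof at once
    ok, error = check(theorem, proof)
    for _ in range(max_repairs):
        if ok:
            break
        proof = repair(theorem, proof, error)    # failed proof + error as context
        ok, error = check(theorem, proof)
    return proof if ok else None
```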

@DynamicWebPaige - 👩‍💻 Paige Bailey

👉🏻 If you're interested in trying out some of the smaller models (powered by PaLM 2, soon to be Gemini), you can check out:

- @GoogleColab: https://blog.google/technology/developers/google-colab-ai-coding-features/
- Duet AI for Developers (which includes security, DevOps, and data analysis features): https://cloud.google.com/blog/products/application-development/introducing-duet-ai-for-developers
- @AndroidStudio: https://developer.android.com/studio/preview/studio-bot
- @Firebase: https://developers.generativeai.google/tools/firebase_extensions
- @GoogleCloud's Codey API: https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-models-overview
- Bard, and Magi: https://blog.google/technology/ai/code-with-bard/

🚀 More product integrations landing in the next several months, stay tuned!

AI-powered coding, free of charge with Colab Colab will soon add AI coding features like code completions, natural language to code generation and even a code-assisting chatbot. blog.google
Google Cloud Duet AI for developers | Google Cloud Blog Explore how Google Cloud’s Duet AI can make developers more productive. cloud.google.com
Meet Studio Bot | Android Studio | Android Developers Learn how to improve your coding productivity with Studio Bot. developer.android.com
PaLM API Firebase Extensions | Generative AI for Developers developers.generativeai.google
Code models overview | Vertex AI | Google Cloud cloud.google.com
Code and debug with Bard Bard can now help with programming and software development tasks, across more than 20 programming languages. blog.google