Tim's Arxiv FrontPage


Generated on 2024-05-06.


This front page is generated by scraping new papers on arXiv and using an embedding model to find papers matching topics I'm interested in. Currently, the false positive rate is fairly high. The repo is here. Forked and customized from this project.
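As a rough illustration of that kind of matching (not the code in the repo), here is a minimal sketch that scores sentence embeddings against a topic embedding with cosine similarity and keeps those above a cutoff. The function names, the toy 3-dimensional vectors, and the 0.82 threshold are all assumptions made for the example; a real setup would get its vectors from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_sentences(topic_vec, sentence_vecs, threshold=0.82):
    """Return (index, score) pairs for sentences scoring at or above threshold.

    The threshold value is a hypothetical choice, not taken from the repo.
    """
    hits = []
    for i, vec in enumerate(sentence_vecs):
        score = cosine_similarity(topic_vec, vec)
        if score >= threshold:
            hits.append((i, round(score, 3)))
    return hits


# Toy vectors standing in for real embeddings (hypothetical data).
topic = [1.0, 0.0, 0.0]
sentences = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
hits = match_sentences(topic, sentences)
```

Lowering the threshold surfaces more papers at the cost of more false positives, which is the trade-off mentioned above.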


Artificial General Intelligence

2024-05-03

The Cambridge RoboMaster: An Agile Multi-Robot Research Platform

Compact robotic platforms with powerful compute and actuation capabilities are key enablers for practical, real-world deployments of multi-agent research. [0.831] This article introduces a tightly integrated hardware, control, and simulation software stack on a fleet of holonomic ground robot platforms designed with this motivation. Our robots, a fleet of customised DJI Robomaster S1 vehicles, offer a balance between small robots that do not possess sufficient compute or actuation capabilities and larger robots that are unsuitable for indoor multi-robot tests. They run a modular ROS2-based optimal estimation and control stack for full onboard autonomy, contain ad-hoc peer-to-peer communication infrastructure, and can zero-shot run multi-agent reinforcement learning (MARL) policies trained in our vectorized multi-agent simulation framework. We present an in-depth review of other platforms currently available, showcase new experimental validation of our system's capabilities, and introduce case studies that highlight the versatility and reliability of our system as a testbed for a wide range of research demonstrations. Our system as well as supplementary material is available online: https://proroklab.github.io/cambridge-robomaster

link

2024-05-03

Adversarial Botometer: Adversarial Analysis for Social Bot Detection

Social bots play a significant role in many online social networks (OSN) as they imitate human behavior. This fact raises difficult questions about their capabilities and potential risks. Given the recent advances in Generative AI (GenAI), social bots are capable of producing highly realistic and complex content that mimics human creativity. [0.849] As the malicious social bots emerge to deceive people with their unrealistic content, identifying them and distinguishing the content they produce has become an actual challenge for numerous social platforms. Several approaches to this problem have already been proposed in the literature, but the proposed solutions have not been widely evaluated. To address this issue, we evaluate the behavior of a text-based bot detector in a competitive environment where some scenarios are proposed: First, the tug-of-war between a bot and a bot detector is examined. It is interesting to analyze which party is more likely to prevail and which circumstances influence these expectations. In this regard, we model the problem as a synthetic adversarial game in which a conversational bot and a bot detector are engaged in strategic online interactions. [0.827] Second, the bot detection model is evaluated under attack examples generated by a social bot; to this end, we poison the dataset with attack examples and evaluate the model performance under this condition. Finally, to investigate the impact of the dataset, a cross-domain analysis is performed. Through our comprehensive evaluation of different categories of social bots using two benchmark datasets, we were able to demonstrate some achievements that could be utilized in future work.

link

2024-05-03

Comparative Analysis of Retrieval Systems in the Real World

This research paper presents a comprehensive analysis of integrating advanced language models with search and retrieval systems in the fields of information retrieval and natural language processing. [0.825] The objective is to evaluate and compare various state-of-the-art methods based on their performance in terms of accuracy and efficiency. The analysis explores different combinations of technologies, including Azure Cognitive Search Retriever with GPT-4, Pinecone's Canopy framework, Langchain with Pinecone and different language models (OpenAI, Cohere), LlamaIndex with Weaviate Vector Store's hybrid search, Google's RAG implementation on Cloud VertexAI-Search, Amazon SageMaker's RAG, and a novel approach called KG-FID Retrieval. The motivation for this analysis arises from the increasing demand for robust and responsive question-answering systems in various domains. [0.823] The RobustQA metric is used to evaluate the performance of these systems under diverse paraphrasing of questions. The report aims to provide insights into the strengths and weaknesses of each method, facilitating informed decisions in the deployment and development of AI-driven search and retrieval systems.

link

2024-05-03

Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo

The rapid advancement in artificial intelligence and natural language processing has led to the development of large-scale datasets aimed at benchmarking the performance of machine learning models. [0.841] Herein, we introduce 'RetChemQA,' a comprehensive benchmark dataset designed to evaluate the capabilities of such models in the domain of reticular chemistry. This dataset includes both single-hop and multi-hop question-answer pairs, encompassing approximately 45,000 Q&As for each type. The questions have been extracted from an extensive corpus of literature containing about 2,530 research papers from publishers including NAS, ACS, RSC, Elsevier, and Nature Publishing Group, among others. The dataset has been generated using OpenAI's GPT-4 Turbo, a cutting-edge model known for its exceptional language understanding and generation capabilities. In addition to the Q&A dataset, we also release a dataset of synthesis conditions extracted from the corpus of literature used in this study. The aim of RetChemQA is to provide a robust platform for the development and evaluation of advanced machine learning algorithms, particularly for the reticular chemistry community. The dataset is structured to reflect the complexities and nuances of real-world scientific discourse, thereby enabling nuanced performance assessments across a variety of tasks. The dataset is available at the following link: https://github.com/nakulrampal/RetChemQA

link

2024-05-03

Learning from Evolution: Improving Collective Decision-Making Mechanisms using Insights from Evolutionary Robotics

Collective decision-making enables multi-robot systems to act autonomously in real-world environments. Existing collective decision-making mechanisms suffer from the so-called speed versus accuracy trade-off or rely on high complexity, e.g., by including global communication. Recent work has shown that more efficient collective decision-making mechanisms based on artificial neural networks can be generated using methods from evolutionary computation. [0.836] A major drawback of these decision-making neural networks is their limited interpretability. Analyzing evolved decision-making mechanisms can help us improve the efficiency of hand-coded decision-making mechanisms while maintaining a higher interpretability. In this paper, we analyze evolved collective decision-making mechanisms in detail and hand-code two new decision-making mechanisms based on the insights gained. In benchmark experiments, we show that the newly implemented collective decision-making mechanisms are more efficient than the state-of-the-art collective decision-making mechanisms voter model and majority rule.

link
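For readers unfamiliar with the two baselines named in this abstract, here is a minimal sketch of the textbook opinion-update rules. It is not the paper's implementation: the function names are hypothetical, and counting a robot's own opinion in the majority and keeping it on ties are assumptions made for this example (variants differ on both points).

```python
import random
from collections import Counter


def voter_update(own, neighbour_opinions, rng):
    """Voter model: copy the opinion of one randomly chosen neighbour."""
    if not neighbour_opinions:
        return own  # nobody in range, keep the current opinion
    return rng.choice(neighbour_opinions)


def majority_update(own, neighbour_opinions):
    """Majority rule: adopt the most common opinion among self and neighbours.

    Assumption for this sketch: ties are resolved in favour of the robot's
    own opinion when it is among the most common ones.
    """
    counts = Counter(list(neighbour_opinions) + [own])
    best = max(counts.values())
    winners = [opinion for opinion, count in counts.items() if count == best]
    return own if own in winners else winners[0]


rng = random.Random(0)  # seeded so repeated runs behave the same
new_voter_opinion = voter_update("A", ["B", "B"], rng)
new_majority_opinion = majority_update("A", ["B", "B", "B"])
```

The voter model needs only one neighbour's opinion per update, whereas majority rule pools several at once, which is one intuition behind the speed versus accuracy trade-off mentioned in the abstract.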

2024-05-03

Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models

In the field of robotics and computer vision, efficient and accurate semantic mapping remains a significant challenge due to the growing demand for intelligent machines that can comprehend and interact with complex environments. [0.832] Conventional panoptic mapping methods, however, are limited by predefined semantic classes, thus making them ineffective for handling novel or unforeseen objects. In response to this limitation, we introduce the Unified Promptable Panoptic Mapping (UPPM) method. UPPM utilizes recent advances in foundation models to enable real-time, on-demand label generation using natural language prompts. By incorporating a dynamic labeling strategy into traditional panoptic mapping techniques, UPPM provides significant improvements in adaptability and versatility while maintaining high performance levels in map reconstruction. We demonstrate our approach on real-world and simulated datasets. Results show that UPPM can accurately reconstruct scenes and segment objects while generating rich semantic labels through natural language interactions. A series of ablation experiments validated the advantages of foundation model-based labeling over fixed label sets.

link

Collective Intelligence

2024-05-03

Learning from Evolution: Improving Collective Decision-Making Mechanisms using Insights from Evolutionary Robotics

Collective decision-making enables multi-robot systems to act autonomously in real-world environments. [0.826] Existing collective decision-making mechanisms suffer from the so-called speed versus accuracy trade-off or rely on high complexity, e.g., by including global communication. Recent work has shown that more efficient collective decision-making mechanisms based on artificial neural networks can be generated using methods from evolutionary computation. A major drawback of these decision-making neural networks is their limited interpretability. Analyzing evolved decision-making mechanisms can help us improve the efficiency of hand-coded decision-making mechanisms while maintaining a higher interpretability. In this paper, we analyze evolved collective decision-making mechanisms in detail and hand-code two new decision-making mechanisms based on the insights gained. In benchmark experiments, we show that the newly implemented collective decision-making mechanisms are more efficient than the state-of-the-art collective decision-making mechanisms voter model and majority rule.

link

Complex Systems

2024-05-03

Multitask Extension of Geometrically Aligned Transfer Encoder

Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. [0.824] Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup. Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data.

link

2024-05-03

Geometric Fabrics: a Safe Guiding Medium for Policy Learning

Robotics policies are always subjected to complex, second order dynamics that entangle their actions with resulting states. [0.828] In reinforcement learning (RL) contexts, policies have the burden of deciphering these complicated interactions over massive amounts of experience and complex reward functions to learn how to accomplish tasks. Moreover, policies typically issue actions directly to controllers like Operational Space Control (OSC) or joint PD control, which induces straightline motion towards these action targets in task or joint space. However, straightline motion in these spaces for the most part does not capture the rich, nonlinear behavior our robots need to exhibit, shifting the burden of discovering these behaviors more completely to the agent. Unlike these simpler controllers, geometric fabrics capture a much richer and desirable set of behaviors via artificial, second order dynamics grounded in nonlinear geometry. These artificial dynamics shift the uncontrolled dynamics of a robot via an appropriate control law to form behavioral dynamics. Behavioral dynamics unlock a new action space and safe, guiding behavior over which RL policies are trained. Behavioral dynamics enable bang-bang-like RL policy actions that are still safe for real robots, simplify reward engineering, and help sequence real-world, high-performance policies. We describe the framework more generally and create a specific instantiation for the problem of dexterous, in-hand reorientation of a cube by a highly actuated robot hand.

link

Decision Making Under Uncertainty

2024-05-03

A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning

In the past decades, most work in the area of data analysis and machine learning was focused on optimizing predictive models and getting better results than what was possible with existing models. To what extent the metrics with which such improvements were measured were accurately capturing the intended goal, whether the numerical differences in the resulting values were significant, or whether uncertainty played a role in this study and if it should have been taken into account, was of secondary importance. Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly replaced in favor of black box models and sheer computing power because of their ability to handle large data sets. This evolution sadly happened at the expense of interpretability and trustworthiness. However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that is of importance, but rather the variability or uncertainty. [0.823] The work in this dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is and how to embrace it instead of fearing it. A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed. Certain aspects and applications of the framework, dubbed 'conformal prediction', are studied in detail. Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title 'distribution-free'. No parametric assumptions have to be made and the nonparametric results also hold without having to resort to the law of large numbers in the asymptotic regime.

link
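To make the framework in this abstract concrete, here is a minimal sketch of the standard split (inductive) conformal recipe for regression intervals. It is a generic textbook construction, not necessarily one of the variants compared in the dissertation; the function names and the use of absolute residuals as the nonconformity score are assumptions made for the example. Under exchangeability of calibration and test points, the resulting interval covers the true value with probability at least 1 - alpha.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Prediction interval for x_new from a held-out calibration set.

    `predict` is any already-fitted point predictor (hypothetical here);
    the nonconformity score is the absolute residual on calibration data.
    """
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q
```

Note that nothing in the sketch inspects the model or the data distribution: the interval width comes only from the ranked calibration residuals, which is what makes the approach distribution-free.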

2024-05-03

Learning from Evolution: Improving Collective Decision-Making Mechanisms using Insights from Evolutionary Robotics

Collective decision-making enables multi-robot systems to act autonomously in real-world environments. Existing collective decision-making mechanisms suffer from the so-called speed versus accuracy trade-off or rely on high complexity, e.g., by including global communication. [0.822] Recent work has shown that more efficient collective decision-making mechanisms based on artificial neural networks can be generated using methods from evolutionary computation. A major drawback of these decision-making neural networks is their limited interpretability. Analyzing evolved decision-making mechanisms can help us improve the efficiency of hand-coded decision-making mechanisms while maintaining a higher interpretability. In this paper, we analyze evolved collective decision-making mechanisms in detail and hand-code two new decision-making mechanisms based on the insights gained. In benchmark experiments, we show that the newly implemented collective decision-making mechanisms are more efficient than the state-of-the-art collective decision-making mechanisms voter model and majority rule.

link

Neural Ordinary Differential Equations

2024-05-03

An analysis and solution of ill-conditioning in physics-informed neural networks

Physics-informed neural networks (PINNs) have recently emerged as a novel and popular approach for solving forward and inverse problems involving partial differential equations (PDEs). [0.835] However, achieving stable training and obtaining correct results remain a challenge in many cases, often attributed to the ill-conditioning of PINNs. Nonetheless, further analysis is still lacking, severely limiting the progress and applications of PINNs in complex engineering problems. Drawing inspiration from the ill-conditioning analysis in traditional numerical methods, we establish a connection between the ill-conditioning of PINNs and the ill-conditioning of the Jacobian matrix of the PDE system. Specifically, for any given PDE system, we construct its controlled system. This controlled system allows for adjustment of the condition number of the Jacobian matrix while retaining the same solution as the original system. Our numerical findings suggest that the ill-conditioning observed in PINNs predominantly stems from that of the Jacobian matrix. As the condition number of the Jacobian matrix decreases, the controlled systems exhibit faster convergence rates and higher accuracy. Building upon this understanding and the natural extension of controlled systems, we present a general approach to mitigate the ill-conditioning of PINNs, leading to successful simulations of the three-dimensional flow around the M6 wing at a Reynolds number of 5,000. To the best of our knowledge, this is the first time that PINNs have been successful in simulating such complex systems, offering a promising new technique for addressing industrial complexity problems. Our findings also offer valuable insights guiding the future development of PINNs.

link

2024-05-03

Parameter estimation in ODEs: assessing the potential of local and global solvers

We consider the problem of parameter estimation in dynamic systems described by ordinary differential equations. [0.837] A review of the existing literature emphasizes the need for deterministic global optimization methods due to the nonconvex nature of these problems. Recent works have focused on expanding the capabilities of specialized deterministic global optimization algorithms to handle more complex problems. Despite advancements, current deterministic methods are limited to problems with a maximum of around five state and five decision variables, prompting ongoing efforts to enhance their applicability to practical problems. Our study seeks to assess the effectiveness of state-of-the-art general-purpose global and local solvers in handling realistic-sized problems efficiently, and to evaluate their capabilities to cope with the nonconvex nature of the underlying estimation problems.

link

Reinforcement Learning

2024-05-03

Learning Optimal Deterministic Policies with Stochastic Policy Gradients

Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems. [0.83] They learn stochastic parametric (hyper)policies by either exploring in the space of actions or in the space of parameters. Stochastic controllers, however, are often undesirable from a practical perspective because of their lack of robustness, safety, and traceability. In common practice, stochastic (hyper)policies are learned only to deploy their deterministic version. In this paper, we make a step towards the theoretical understanding of this practice. After introducing a novel framework for modeling this scenario, we study the global convergence to the best deterministic policy, under (weak) gradient domination assumptions. Then, we illustrate how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy. Finally, we quantitatively compare action-based and parameter-based exploration, giving a formal guise to intuitive results.

link

2024-05-03

Multi-Objective Recommendation via Multivariate Policy Learning

Real-world recommender systems often need to balance multiple objectives when deciding which recommendations to present to users. These include behavioural signals (e.g. clicks, shares, dwell time), as well as broader objectives (e.g. diversity, fairness). Scalarisation methods are commonly used to handle this balancing task, where a weighted average of per-objective reward signals determines the final score used for ranking. Naturally, how these weights are computed exactly, is key to success for any online platform. We frame this as a decision-making task, where the scalarisation weights are actions taken to maximise an overall North Star reward (e.g. long-term user retention or growth). We extend existing policy learning methods to the continuous multivariate action domain, proposing to maximise a pessimistic lower bound on the North Star reward that the learnt policy will yield. [0.822] Typical lower bounds based on normal approximations suffer from insufficient coverage, and we propose an efficient and effective policy-dependent correction for this. We provide guidance to design stochastic data collection policies, as well as highly sensitive reward signals. Empirical observations from simulations, offline and online experiments highlight the efficacy of our deployed approach.

link
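The scalarisation step this abstract describes (a weighted average of per-objective reward signals producing one ranking score) can be sketched in a few lines. This only illustrates the scoring step, not the paper's policy learning method; the function names, objective names, and numbers are hypothetical.

```python
def scalarise(rewards, weights):
    """Weighted average of per-objective rewards for a single item."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[name] * rewards[name] for name in weights) / total


def rank_items(items, weights):
    """Item names ordered from highest to lowest scalarised score."""
    return sorted(items, key=lambda name: scalarise(items[name], weights), reverse=True)


# Hypothetical per-objective reward estimates for two candidate items.
items = {
    "a": {"clicks": 0.9, "diversity": 0.1},
    "b": {"clicks": 0.5, "diversity": 0.8},
}
clicks_only = rank_items(items, {"clicks": 1.0, "diversity": 0.0})
balanced = rank_items(items, {"clicks": 1.0, "diversity": 1.0})
```

Changing the weights flips the ordering of the two items, which is why the abstract treats the choice of weights as the action to be learnt.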

2024-05-03

Model-based reinforcement learning for protein backbone design

Designing protein nanomaterials of predefined shape and characteristics has the potential to dramatically impact the medical industry. Machine learning (ML) has proven successful in protein design, reducing the need for expensive wet lab experiment rounds. However, challenges persist in efficiently exploring the protein fitness landscapes to identify optimal protein designs. In response, we propose the use of AlphaZero to generate protein backbones, meeting shape and structural scoring requirements. We extend an existing Monte Carlo tree search (MCTS) framework by incorporating a novel threshold-based reward and secondary objectives to improve design precision. This innovation considerably outperforms existing approaches, leading to protein backbones that better respect structural scores. The application of AlphaZero is novel in the context of protein backbone design and demonstrates promising performance. AlphaZero consistently surpasses baseline MCTS by more than 100% in top-down protein design tasks. Additionally, our application of AlphaZero with secondary objectives uncovers further promising outcomes, indicating the potential of model-based reinforcement learning (RL) in navigating the intricate and nuanced aspects of protein design. [0.825]

link

2024-05-03

Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach

Robust Reinforcement Learning (RRL) is a promising Reinforcement Learning (RL) paradigm aimed at training models that are robust to uncertainty or disturbances, making them more efficient for real-world applications. [0.873] Following this paradigm, uncertainty or disturbances are interpreted as actions of a second adversarial agent, and thus, the problem is reduced to seeking the agents' policies robust to any opponent's actions. This paper is the first to propose considering the RRL problems within the positional differential game theory, which helps us to obtain theoretically justified intuition to develop a centralized Q-learning approach. [0.824] Namely, we prove that under Isaacs's condition (sufficiently general for real-world dynamical systems), the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations. Based on these results, we present the Isaacs Deep Q-Network algorithms and demonstrate their superiority compared to other baseline RRL and Multi-Agent RL algorithms in various environments.

link

2024-05-03

Simulating the economic impact of rationality through reinforcement learning and agent-based modelling

Agent-based models (ABMs) are simulation models used in economics to overcome some of the limitations of traditional frameworks based on general equilibrium assumptions. However, agents within an ABM follow predetermined, not fully rational, behavioural rules which can be cumbersome to design and difficult to justify. Here we leverage multi-agent reinforcement learning (RL) to expand the capabilities of ABMs with the introduction of fully rational agents that learn their policy by interacting with the environment and maximising a reward function. [0.865] Specifically, we propose a 'Rational macro ABM' (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature. We show that gradually substituting ABM firms in the model with RL agents, trained to maximise profits, allows for a thorough study of the impact of rationality on the economy. We find that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality. [0.833] We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits. Finally, we find that a higher degree of rationality in the economy always improves the macroeconomic environment as measured by total output; depending on the specific rational policy, this can come at the cost of higher instability. Our R-MABM framework is general, it allows for stable multi-agent learning, and represents a principled and robust direction to extend existing economic simulators.

link

2024-05-03

Towards Improving Learning from Demonstration Algorithms via MCMC Methods

Behavioral cloning, or more broadly, learning from demonstrations (LfD) is a promising direction for robot policy learning in complex scenarios. [0.823] Albeit being straightforward to implement and data-efficient, behavioral cloning has its own drawbacks, limiting its efficacy in real robot setups. In this work, we take one step towards improving learning from demonstration algorithms by leveraging implicit energy-based policy models. Results suggest that in selected complex robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used neural network-based explicit models, especially in the cases of approximating potentially discontinuous and multimodal functions.

link

2024-05-03

Geometric Fabrics: a Safe Guiding Medium for Policy Learning

Robotics policies are always subjected to complex, second order dynamics that entangle their actions with resulting states. In reinforcement learning (RL) contexts, policies have the burden of deciphering these complicated interactions over massive amounts of experience and complex reward functions to learn how to accomplish tasks. [0.866] Moreover, policies typically issue actions directly to controllers like Operational Space Control (OSC) or joint PD control, which induces straightline motion towards these action targets in task or joint space. However, straightline motion in these spaces for the most part does not capture the rich, nonlinear behavior our robots need to exhibit, shifting the burden of discovering these behaviors more completely to the agent. Unlike these simpler controllers, geometric fabrics capture a much richer and desirable set of behaviors via artificial, second order dynamics grounded in nonlinear geometry. These artificial dynamics shift the uncontrolled dynamics of a robot via an appropriate control law to form behavioral dynamics. Behavioral dynamics unlock a new action space and safe, guiding behavior over which RL policies are trained. [0.82] Behavioral dynamics enable bang-bang-like RL policy actions that are still safe for real robots, simplify reward engineering, and help sequence real-world, high-performance policies. We describe the framework more generally and create a specific instantiation for the problem of dexterous, in-hand reorientation of a cube by a highly actuated robot hand.

link

Trajectory Optimization

2024-05-03

Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving

Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. [0.856] To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. [0.822] At the heart of this model lies the Characterized Diffusion Module, an innovative module designed to simulate traffic scenarios with inherent uncertainty. This module enriches the predictive process by infusing it with detailed semantic information, thereby enhancing trajectory prediction accuracy. Complementing this, our Spatio-Temporal (ST) Interaction Module captures the nuanced effects of traffic scenarios on vehicle dynamics across both spatial and temporal dimensions with remarkable effectiveness. Demonstrated through exhaustive evaluations, our model sets a new standard in trajectory prediction, achieving state-of-the-art (SOTA) results on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets across both short and extended temporal spans. This performance underscores the model's unparalleled adaptability and efficacy in navigating complex traffic scenarios, including highways, urban streets, and intersections.

link