Welcome to M5B: Machine 5-Minute Briefing

Your centralized dashboard for the generative AI revolution. Track the latest models, secure exclusive offers, and master the prompt.

Research• Jan 6, 2026

Horizon Reduction as Information Loss in Offline Reinforcement Learning

arXiv:2601.00831v1 Announce Type: new Abstract: Horizon reduction is a common design strategy in offline reinforcement learning (RL), used to mitigate long-horizon credit assignment, improve stability, and enable scalable learning through truncated rollouts, windowed training, or hierarchical decom...

#ArXiv#Machine Learning#Academic

Research• Jan 6, 2026

MathLedger: A Verifiable Learning Substrate with Ledger-Attested Feedback

arXiv:2601.00816v1 Announce Type: new Abstract: Contemporary AI systems achieve extraordinary performance yet remain opaque and non-verifiable, creating a crisis of trust for safety-critical deployment. We introduce MathLedger, a substrate for verifiable machine cognition that integrates formal ver...

#ArXiv#Machine Learning#Academic

Research• Jan 6, 2026

Semantic Alignment of Multilingual Knowledge Graphs via Contextualized Vector Projections

arXiv:2601.00814v1 Announce Type: new Abstract: The paper presents our work on a cross-lingual ontology alignment system that uses embedding-based cosine similarity matching. The ontology entities are made contextually richer by creating descriptions using novel techniques. We use a fine-tuned trans...

#ArXiv#Machine Learning#Academic

Research• Jan 5, 2026

Evaluating Anomaly Detectors for Simulated Highly Imbalanced Industrial Classification Problems

arXiv:2601.00005v1 Announce Type: new Abstract: Machine learning offers potential solutions to current issues in industrial systems in areas such as quality control and predictive maintenance, but also faces unique barriers in industrial applications. An ongoing challenge is extreme class imbalance...

#ArXiv#Machine Learning#Academic

Research• Jan 5, 2026

Yahtzee: Reinforcement Learning Techniques for Stochastic Combinatorial Games

arXiv:2601.00007v1 Announce Type: new Abstract: Yahtzee is a classic dice game with a stochastic, combinatorial structure and delayed rewards, making it an interesting mid-scale RL benchmark. While an optimal policy for solitaire Yahtzee can be computed using dynamic programming methods, multiplaye...

#ArXiv#Machine Learning#Academic

Research• Jan 5, 2026

IMBWatch -- a Spatio-Temporal Graph Neural Network approach to detect Illicit Massage Business

arXiv:2601.00075v1 Announce Type: new Abstract: Illicit Massage Businesses (IMBs) are a covert and persistent form of organized exploitation that operate under the facade of legitimate wellness services while facilitating human trafficking, sexual exploitation, and coerced labor. Detecting IMBs is ...

#ArXiv#Machine Learning#Academic

Research• Jan 5, 2026

Finetuning Large Language Models for Automated Depression Screening in Nigerian Pidgin English: GENSCORE Pilot Study

arXiv:2601.00004v1 Announce Type: new Abstract: Depression is a major contributor to the mental-health burden in Nigeria, yet screening coverage remains limited due to low access to clinicians, stigma, and language barriers. Traditional tools like the Patient Health Questionnaire-9 (PHQ-9) were val...

#ArXiv#Machine Learning#Academic

Research• Jan 1, 2026

A Comprehensive Study of Deep Learning Model Fixing Approaches

arXiv:2512.23745v1 Announce Type: new Abstract: Deep Learning (DL) has been widely adopted in diverse industrial domains, including autonomous driving, intelligent healthcare, and aided programming. Like traditional software, DL systems are also prone to faults, whose malfunctioning may expose user...

#ArXiv#Machine Learning#Academic

Research• Jan 1, 2026

Network Traffic Analysis with Process Mining: The UPSIDE Case Study

arXiv:2512.23718v1 Announce Type: new Abstract: Online gaming is a popular activity involving the adoption of complex systems and network infrastructures. The relevance of gaming, which generates large amounts of market revenue, drove research in modeling network devices' behavior to evaluate bandw...

#ArXiv#Machine Learning#Academic

Research• Dec 29, 2025

A Survey of Freshness-Aware Wireless Networking with Reinforcement Learning

arXiv:2512.21412v1 Announce Type: new Abstract: The age of information (AoI) has become a central measure of data freshness in modern wireless systems, yet existing surveys either focus on classical AoI formulations or provide broad discussions of reinforcement learning (RL) in wireless networks wi...

#ArXiv#Machine Learning#Academic

Research• Dec 29, 2025

Three-way decision with incomplete information based on similarity and satisfiability

arXiv:2512.21421v1 Announce Type: new Abstract: Three-way decision is widely applied with rough set theory to learn classification or decision rules. The approaches dealing with complete information are well established in the literature, including the two complementary computational and conceptual...

#ArXiv#Machine Learning#Academic

Research• Dec 29, 2025

A Study of Solving Life-and-Death Problems in Go Using Relevance-Zone Based Solvers

arXiv:2512.21365v1 Announce Type: new Abstract: This paper analyzes the behavior of solving Life-and-Death (L&D) problems in the game of Go using current state-of-the-art computer Go solvers with two techniques: the Relevance-Zone Based Search (RZS) and the relevance-zone pattern table. We examined...

#ArXiv#Machine Learning#Academic

Research• Dec 25, 2025

Proceedings of the 20th International Conference on Knowledge, Information and Creativity Support Systems (KICSS 2025)

arXiv:2512.20628v1 Announce Type: new Abstract: This volume presents the proceedings of the 20th International Conference on Knowledge, Information and Creativity Support Systems (KICSS 2025), held in Nagaoka, Japan, on December 3-5, 2025. The conference, organized in cooperation with the IEICE Pro...

#ArXiv#Machine Learning#Academic

Research• Dec 22, 2025

Dion2: A Simple Method to Shrink Matrix in Muon

arXiv:2512.16928v1 Announce Type: new Abstract: The Muon optimizer enjoys strong empirical performance and theoretical grounding. However, the super-linear cost of its orthonormalization step introduces increasing overhead with scale. To alleviate this cost, several works have attempted to reduce t...

#ArXiv#Machine Learning#Academic

Research• Dec 19, 2025

SHARe-KAN: Holographic Vector Quantization for Memory-Bound Inference

arXiv:2512.15742v1 Announce Type: new Abstract: Kolmogorov-Arnold Networks (KANs) face a fundamental memory wall: their learned basis functions create parameter counts that impose extreme bandwidth demands, hindering deployment in memory-constrained environments. We show that Vision KANs exhibit a ...

#ArXiv#Machine Learning#Academic

Research• Dec 18, 2025

SepsisSuite: Beyond Risk Stratification -- A Comparative Analysis of Deep Fusion vs. Expert Stacking for Prescriptive Sepsis AI

arXiv:2512.14712v1 Announce Type: new Abstract: Sepsis accounts for nearly 20% of global ICU admissions, yet conventional prediction models often fail to effectively integrate heterogeneous data streams, remaining either siloed by modality or reliant on brittle early fusion. In this work, we presen...

#ArXiv#Machine Learning#Academic

Research• Dec 18, 2025

Improving Underwater Acoustic Classification Through Learnable Gabor Filter Convolution and Attention Mechanisms

arXiv:2512.14714v1 Announce Type: new Abstract: Remotely detecting and classifying underwater acoustic targets is critical for environmental monitoring and defence. However, the complex nature of ship-radiated and environmental underwater noise poses significant challenges to accurate signal proces...

#ArXiv#Machine Learning#Academic

Research• Dec 12, 2025

Robust Gradient Descent via Heavy-Ball Momentum with Predictive Extrapolation

arXiv:2512.10033v1 Announce Type: new Abstract: Accelerated gradient methods like Nesterov's Accelerated Gradient (NAG) achieve faster convergence on well-conditioned problems but often diverge on ill-conditioned or non-convex landscapes due to aggressive momentum accumulation. We propose Heavy-Bal...

#ArXiv#Machine Learning#Academic

Research• Dec 12, 2025

HGC-Herd: Efficient Heterogeneous Graph Condensation via Representative Node Herding

arXiv:2512.09947v1 Announce Type: new Abstract: Heterogeneous graph neural networks (HGNNs) have demonstrated strong capability in modeling complex semantics across multi-type nodes and relations. However, their scalability to large-scale graphs remains challenging due to structural redundancy and ...

#ArXiv#Machine Learning#Academic

Research• Dec 12, 2025

BAMBO: Construct Ability and Efficiency LLM Pareto Set via Bayesian Adaptive Multi-objective Block-wise Optimization

arXiv:2512.09972v1 Announce Type: new Abstract: Constructing a Pareto set is pivotal for navigating the capability-efficiency trade-offs in Large Language Models (LLMs); however, existing merging techniques remain inadequate for this task. Coarse-grained, model-level methods yield only a sparse set...

#ArXiv#Machine Learning#Academic

Research• Nov 1, 2025

RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer. Unlike traditional methods, this algorithm is not based on temporal difference (TD) learning (which has scalability challenges), and scales well to long-horizon tasks. We ...

#Berkeley#BAIR#Academic

Research• Sep 1, 2025

What exactly does word2vec learn?

What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. Despite the fact that word2vec is a well-known precursor to modern language models, for many years, researchers lacked a quantitati...

#Berkeley#BAIR#Academic

Research• Jul 1, 2025

Whole-Body Conditioned Egocentric Video Prediction

Predicting Ego-centric Video from human Actions (PEVA). Given past video frames and an action specifying a desired change in 3D pose, PEVA predicts the next video frame. Our results show that, given the first frame and a sequence of actions, our model can generate videos of...

#Berkeley#BAIR#Academic

Research• Apr 11, 2025

Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)

Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection is listed by OWASP as the #1 threat to LLM-integrated applications, where an LLM input contains a trusted prompt (ins...

#Berkeley#BAIR#Academic