Page 95 | AI News & Updates | Latest Artificial Intelligence Developments

#advanced (300) #amazon bedrock #amazon nova #amazon sagemaker ai #foundation models #generative ai #technical how-to

Document intelligence evolved: Building and evaluating KIE solutions that scale

In this blog post, we demonstrate an end-to-end approach for building and evaluating a KIE solution using Amazon Nova models available through Amazon Bedrock. This end-to-end approach encompasses three critical phases: data readiness (understanding and preparing your documents), solution development (implementing extraction logic with appropriate models), and performance measurement (evaluating accuracy, efficiency, and cost-effectiveness). We illustrate this comprehensive approach using the FATURA dataset—a collection of diverse invoice documents that serves as a representative proxy for real-world enterprise data.

#amazon sagemaker hyperpod

Announcing the new cluster creation experience for Amazon SageMaker HyperPod

With the new cluster creation experience, you can create your SageMaker HyperPod clusters, including the required prerequisite AWS resources, in one click, with prescriptive default values automatically applied. In this post, we explore the new cluster creation experience for Amazon SageMaker HyperPod.

Creating Slick Data Dashboards with Python, Taipy & Google Sheets

Develop simple yet powerful business intelligence tools tailored to meet your company's specific needs.

#artificial intelligence #data analysis #data science #writing #author spotlights #machine learning #programming

Writing Is Thinking

Egor Howell on breaking into ML without a CS degree, surviving 80+ interviews, and what to do if you feel stuck in your career.
The post Writing Is Thinking appeared first on Towards Data Science.

What is Data Science in Simple Words?

This introductory article discusses what data science is (and isn't), its relationship with math, statistics, and computer science, and why it is so important and impactful nowadays... all in simple terms.

3 Ways to Speed Up and Improve Your XGBoost Models

Extreme gradient boosting (XGBoost) is one of the most prominent machine learning techniques used not only for experimentation and analysis but also in deployed predictive solutions in industry.

5 Reasons Why Vibe Coding Threatens Secure Data App Development

AI-powered "vibe coding" promises rapid development but creates unprecedented security risks for data applications handling sensitive information.

#culture #culture / digital culture

Spiritual Influencers Say ‘Sentient’ AI Can Help You Solve Life’s Mysteries

As concerns grow about AI chatbots leading users into delusional spirals, prominent spiritual influencers are capitalizing on an emerging form of techno-spirituality.

#business #business / artificial intelligence

Meet the Guys Betting Big on AI Gambling Agents

Online gambling is a massive industry. The AI boom keeps booming. It was only a matter of time before people tried to put them together.

#artificial intelligence #app #the algorithm #subscriber-only stories

Can an AI doppelgänger help me do my job?

Everywhere I look, I see AI clones. On X and LinkedIn, “thought leaders” and influencers offer their followers a chance to ask questions of their digital replicas. OnlyFans creators are having AI models of themselves chat, for a price, with followers. “Virtual human” salespeople in China are reportedly outselling real humans. Digital clones—AI models that…

#artificial intelligence #app

Therapists are secretly using ChatGPT. Clients are triggered.

Declan would never have found out his therapist was using ChatGPT had it not been for a technical mishap. The connection was patchy during one of their online sessions, so Declan suggested they turn off their video feeds. Instead, his therapist began inadvertently sharing his screen. “Suddenly, I was watching him use ChatGPT,” says Declan,…

AI Glasses Go Viral

ChatGPT glasses win, .ai island gold, AI schools, spotting fake AI, and more...

#data science #artificial intelligence #career advice #data engineering #machine learning #specialization

The Generalist: The New All-Around Type of Data Professional?

Is over-specialization ending and are data generalists on the rise?
The post The Generalist: The New All-Around Type of Data Professional? appeared first on Towards Data Science.

#business #business / artificial intelligence

Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America

WIRED talks to the director of the Chilean National Center for Artificial Intelligence about Latam-GPT, the large-language model that aims to address the region’s specific needs and change the current technological dynamic.

#politics #politics / politics news #business #business / artificial intelligence

WIRED Roundup: Meta’s AI Brain Drain

On this episode of Uncanny Valley, we look back at the week's biggest stories—from the researchers leaving Meta's new superintelligence lab, to the dark money group funding Democratic influencers.

What exactly does word2vec learn?

What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. Despite the fact that word2vec is a well-known precursor to modern language models, for many years, researchers lacked a quantitative and predictive theory describing its learning process. In our new paper, we finally provide such a theory. We prove that there are realistic, practical regimes in which the learning problem reduces to unweighted least-squares matrix factorization. We solve the gradient flow dynamics in closed form; the final learned representations are simply given by PCA.

Learning dynamics of word2vec. When trained from small initialization, word2vec learns in discrete, sequential steps. Left: rank-incrementing learning steps in the weight matrix, each decreasing the loss. Right: three time slices of the latent embedding space showing how embedding vectors expand into subspaces of increasing dimension at each learning step, continuing until model capacity is saturated.

Before elaborating on this result, let’s motivate the problem. word2vec is a well-known algorithm for learning dense vector representations of words. These embedding vectors are trained using a contrastive algorithm; at the end of training, the semantic relation between any two words is captured by the angle between the corresponding embeddings. In fact, the learned embeddings empirically exhibit striking linear structure in their geometry: linear subspaces in the latent space often encode interpretable concepts such as gender, verb tense, or dialect. This so-called linear representation hypothesis has recently garnered a lot of attention since LLMs exhibit this behavior as well, enabling semantic inspection of internal representations and providing for novel model steering techniques. In word2vec, it is precisely these linear directions that enable the learned embeddings to complete analogies (e.g., “man : woman :: king : queen”) via embedding vector addition.
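To make the analogy mechanism concrete, here is a minimal sketch of analogy completion by vector addition. The embeddings are hand-built toy vectors (an assumption for illustration, not trained word2vec output), and `complete_analogy` is a hypothetical helper name: it forms the vector `b - a + c` and returns the remaining word whose embedding has the highest cosine similarity to it.

```python
import numpy as np

# Toy, hand-built embeddings (illustrative only, not trained vectors).
# Axis 1 plays the role of a "gender" direction, axis 2 a "royalty" direction.
toy_embeddings = {
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([1.0, 1.0, 0.0]),
    "king":  np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "apple": np.array([-1.0, 0.2, 0.1]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def complete_analogy(a, b, c, embeddings):
    """Answer 'a : b :: c : ?' via the vector b - a + c."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    # Exclude the query words themselves, as is common in analogy benchmarks.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))

answer = complete_analogy("man", "woman", "king", toy_embeddings)
print(answer)
```

In this toy setup the "gender" offset is the same between both word pairs, which is exactly the kind of linear structure the paragraph above describes.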

Maybe this shouldn’t be too surprising: after all, the word2vec algorithm simply iterates through a text corpus and trains a two-layer linear network to model statistical regularities in natural language using self-supervised gradient descent. In this framing, it’s clear that word2vec is a minimal neural language model. Understanding word2vec is thus a prerequisite to understanding feature learning in more sophisticated language modeling tasks.

The Result

With this motivation in mind, let’s describe the main result. Concretely, suppose we initialize all the embedding vectors randomly and very close to the origin, so that they’re effectively zero-dimensional. Then (under some mild approximations) the embeddings collectively learn one “concept” (i.e., orthogonal linear subspace) at a time in a sequence of discrete learning steps.

It’s like when diving head-first into learning a new branch of math. At first, all the jargon is muddled — what’s the difference between a function and a functional? What about a linear operator vs. a matrix? Slowly, through exposure to new settings of interest, the words separate from each other in the mind and their true meanings become clearer.

As a consequence, each new realized linear concept effectively increments the rank of the embedding matrix, giving each word embedding more space to better express itself and its meaning. Since these linear subspaces do not rotate once they’re learned, these are effectively the model’s learned features. Our theory allows us to compute each of these features a priori in closed form – they are simply the eigenvectors of a particular target matrix which is defined solely in terms of measurable corpus statistics and algorithmic hyperparameters.

What are the features?

The answer is remarkably straightforward: the latent features are simply the top eigenvectors of the following matrix:

\[M^{\star}_{ij} = \frac{P(i,j) - P(i)P(j)}{\frac{1}{2}(P(i,j) + P(i)P(j))}\]

where $i$ and $j$ index the words in the vocabulary, $P(i,j)$ is the co-occurrence probability for words $i$ and $j$, and $P(i)$ is the unigram probability for word $i$ (i.e., the marginal of $P(i,j)$).
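As a rough illustration of this definition, the sketch below builds $M^{\star}$ from a small symmetric matrix of co-occurrence counts and extracts its top eigenvector. The counts are made up, `target_matrix` is a hypothetical helper name, and ranking eigenvectors by largest algebraic eigenvalue is our assumption about what "top" means here.

```python
import numpy as np

def target_matrix(cooc_counts):
    """Build M* from a symmetric matrix of co-occurrence counts."""
    counts = np.asarray(cooc_counts, dtype=float)
    P_ij = counts / counts.sum()      # co-occurrence probabilities P(i, j)
    P_i = P_ij.sum(axis=1)            # unigram marginals P(i)
    indep = np.outer(P_i, P_i)        # P(i) P(j)
    return (P_ij - indep) / (0.5 * (P_ij + indep))

# Made-up counts for a 4-word vocabulary with two "topics":
# words 0 and 1 co-occur often, as do words 2 and 3.
counts = np.array([
    [10, 8, 1, 1],
    [8, 10, 1, 1],
    [1, 1, 10, 8],
    [1, 1, 8, 10],
])

M = target_matrix(counts)

# M is symmetric, so eigh applies; sort eigenvalues in descending order.
eigvals, eigvecs = np.linalg.eigh(M)
order = np.argsort(eigvals)[::-1]
top_feature = eigvecs[:, order[0]]
print(np.round(top_feature, 3))
```

On these toy counts the top eigenvector assigns one sign to words 0 and 1 and the opposite sign to words 2 and 3, so it separates the two groups of words that co-occur with each other.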

Constructing and diagonalizing this matrix from the Wikipedia statistics, one finds that the top eigenvector selects words associated with celebrity biographies, the second eigenvector selects words associated with government and municipal administration, the third is associated with geographical and cartographical descriptors, and so on.

The takeaway is this: during training, word2vec finds a sequence of optimal low-rank approximations of $M^{\star}$. It’s effectively equivalent to running PCA on $M^{\star}$.
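A sequence of low-rank approximations can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: a random symmetric positive semidefinite matrix stands in for $M^{\star}$, `rank_k_approx` is a hypothetical helper, and negative eigenvalues are clipped on the assumption that a factorization of the form $W W^{\top}$ can only represent non-negative ones.

```python
import numpy as np

def rank_k_approx(M, k):
    """Rank-k approximation of a symmetric matrix from its top-k eigenpairs."""
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:k]          # indices of the k largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)       # keep only non-negative eigenvalues
    V = eigvecs[:, top]
    return (V * lam) @ V.T                       # V diag(lam) V^T

# Stand-in target: a random symmetric PSD matrix (not real corpus statistics).
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
M = A @ A.T

# Frobenius error of the best rank-k approximation for k = 0, ..., 6.
errors = [float(np.linalg.norm(M - rank_k_approx(M, k))) for k in range(7)]
print(np.round(errors, 3))
```

Each additional eigenpair lowers the reconstruction error, mirroring the picture of one feature being added per learning step.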

The following plots illustrate this behavior.

Learning dynamics comparison showing discrete, sequential learning steps.

On the left, the key empirical observation is that word2vec (plus our mild approximations) learns in a sequence of essentially discrete steps. Each step increments the effective rank of the embeddings, resulting in a stepwise decrease in the loss. On the right, we show three time slices of the latent embedding space, demonstrating how the embeddings expand along a new orthogonal direction at each learning step. Furthermore, by inspecting the words that most strongly align with these singular directions, we observe that each discrete “piece of knowledge” corresponds to an interpretable topic-level concept. These learning dynamics are solvable in closed form, and we see an excellent match between the theory and numerical experiment.
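The stepwise behavior can be reproduced in a toy setting. The sketch below is a simplified stand-in, not the paper's experiment: it runs plain gradient descent on the unweighted least-squares objective $\frac{1}{4}\lVert M - W W^{\top} \rVert_F^2$ from a very small initialization, with a synthetic target whose eigenvalues (4, 2, 1), learning rate, and rank threshold are all arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, dim = 5, 3

# Synthetic symmetric target with three well-separated positive eigenvalues.
Q, _ = np.linalg.qr(rng.normal(size=(n_words, n_words)))
M = (Q[:, :3] * np.array([4.0, 2.0, 1.0])) @ Q[:, :3].T

# Small random initialization, so the embeddings start near the origin.
W = 1e-6 * rng.normal(size=(n_words, dim))

lr, n_steps = 0.01, 3000
ranks, losses = [], []
for step in range(n_steps + 1):
    residual = M - W @ W.T
    if step % 50 == 0:
        losses.append(0.25 * float(np.sum(residual ** 2)))
        # Effective rank: number of singular values of W above a threshold.
        singular_values = np.linalg.svd(W, compute_uv=False)
        ranks.append(int(np.sum(singular_values > 0.1)))
    # Gradient of the loss with respect to W is -(M - W W^T) W.
    W += lr * residual @ W

print(ranks)
print(np.round(losses, 3))
```

Printing `ranks` next to `losses` shows the effective rank climbing one unit at a time while the loss drops in plateaus, with the direction of the largest eigenvalue learned first.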

What are the mild approximations? They are: 1) a quartic approximation of the objective function around the origin; 2) a particular constraint on the algorithmic hyperparameters; 3) sufficiently small initial embedding weights; and 4) vanishingly small gradient descent steps. Thankfully, these conditions are not too strong, and in fact they’re quite similar to the setting described in the original word2vec paper.

Importantly, none of the approximations involve the data distribution! Indeed, a huge strength of the theory is that it makes no distributional assumptions. As a result, the theory predicts exactly what features are learned in terms of the corpus statistics and the algorithmic hyperparameters. This is particularly useful, since fine-grained descriptions of learning dynamics in the distribution-agnostic setting are rare and hard to obtain; to our knowledge, this is the first one for a practical natural language task.

As for the approximations we do make, we empirically show that our theoretical result still provides a faithful description of the original word2vec. As a coarse indicator of the agreement between our approximate setting and true word2vec, we can compare the empirical scores on the standard analogy completion benchmark: word2vec achieves 68% accuracy, the approximate model we study achieves 66%, and the standard classical alternative (known as PPMI) only gets 51%. Check out our paper to see plots with detailed comparisons.

To demonstrate the usefulness of the result, we apply our theory to study the emergence of abstract linear representations (corresponding to binary concepts such as masculine/feminine or past/future). We find that over the course of learning, word2vec builds these linear representations in a sequence of noisy learning steps, and their geometry is well-described by a spiked random matrix model. Early in training, semantic signal dominates; however, later in training, noise may begin to dominate, causing a degradation of the model’s ability to resolve the linear representation. See our paper for more details.

All in all, this result gives one of the first complete closed-form theories of feature learning in a minimal yet relevant natural language task. In this sense, we believe our work is an important step forward in the broader project of obtaining realistic analytical solutions describing the performance of practical machine learning algorithms.

Learn more about our work: Link to full paper

This post originally appeared on Dhruva Karkada’s blog.

AI Brings Old Photos Back to Life

Photo revival, fast food flop, AI startup playbook, unsafe AI, and more...

#llm applications #artificial intelligence #llm #raspberry pi #voice assistant #language detection

How to Develop a Bilingual Voice Assistant

Exploring ways to make voice assistants more personal
The post How to Develop a Bilingual Voice Assistant appeared first on Towards Data Science.

#intermediate #machine learning

Beyond AUC and RMSE: How to Align Offline Metrics with Real-World KPIs

For ML practitioners, the natural expectation is that a new ML model that shows promising results offline will also succeed in production. But often, that’s not the case. ML models that outperform on test data can underperform for real production users. This discrepancy between offline and online metrics is often a big challenge in applied […]
The post Beyond AUC and RMSE: How to Align Offline Metrics with Real-World KPIs appeared first on Analytics Vidhya.

#machine learning #career insights #data science #experiment design #producitivity #notebook

The Machine Learning Lessons I’ve Learned This Month

August 2025: logging, lab notebooks, overnight runs
The post The Machine Learning Lessons I’ve Learned This Month appeared first on Towards Data Science.

#math #deep dives #linear algebra #matrix #matrix multiplication #vector

Understanding Matrices | Part 4: Matrix Inverse

The physical meaning of matrix inversion, related formulas, and how inversion behaves on several special types of matrices.
The post Understanding Matrices | Part 4: Matrix Inverse appeared first on Towards Data Science.

#large language models #llm applications #perplexity #python #voice assistant #google assistant

Crafting a Custom Voice Assistant with Perplexity

How to build a fully functional, hands-free voice assistant on a Raspberry Pi
The post Crafting a Custom Voice Assistant with Perplexity appeared first on Towards Data Science.

Meta Considers Using GPT/Gemini

Meta’s bold move, major AI breach, AI’s secret life, TIME100 AI, AGI hype, and more...

#beginner #generative ai #machine learning

How Dentsu Uses Generative Machine Learning to Transform Customer Service?

In today’s dynamic business environment, a company’s approach to customer experience can significantly impact its brand perception. One poor interaction, such as a missed delivery or an unhelpful agent, and the relationship often doesn’t recover. Industry data puts it into perspective: Nearly 32% of consumers abandon a brand after just one bad experience. The stakes […]
The post How Dentsu Uses Generative Machine Learning to Transform Customer Service? appeared first on Analytics Vidhya.

#ai #business #security #ai cybersecurity #charlotte ai #cloud security #crowdstrike #deepfake detection #deepfake fraud #generative ai security #ivanti #palo alto networks #security service edge #ai inference security #automated remediation #biometric authentication #budget restructuring #calypsoai #cisos #consolidated cybersecurity platforms #continuous control monitoring #credential management #cybersecurity budgets #cybersecurity platform consolidation #cybersecurity spending trends #data classification #data discovery #falcon complete #forrester cybersecurity report #gen ai attacks #hardware cybersecurity spending #identity sprawl #inference layer security #integration overhead #interactive application security testing #machine identities #mdvm #microsecond response cybersecurity #microsoft security #nist quantum standards #phishing automation #platformization #protect ai acquisition #quantum computing threats #quantum-resistant cryptography #ransomware economics #regional cybersecurity spending #runtime defense architecture #security tool sprawl #software cybersecurity budgets #sub-second threat detection #supply chain attacks #trust centers #unified sase #zero-day exploits

Software commands 40% of cybersecurity budgets as gen AI attacks execute in milliseconds

Software spending now makes up 40% of cybersecurity budgets, with investment expected to grow as CISOs prioritize real-time AI defenses.

#ai #ai research #ai, ml and deep learning #large language models #large language models (llms) #llms #research #merging techniques #model merging

How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining

M2N2 is a model merging technique that creates powerful multi-skilled agents without the high cost and data needs of retraining.

#ai #business #agents #ai, ml and deep learning #intuit #quickbooks

How Intuit killed the chatbot crutch – and built an agentic AI playbook you can copy

This is the inside story of Intuit's transformation journey with AI — including a grueling nine-month pivot to "burn the boats" and reinvent how the 40-year-old finance giant builds its products.

#gemini #pixel #ai

Learn what makes Pixel 10’s camera tech and AI features so special.

To kick off the second episode in Season 8 of the Made by Google podcast, host Rachid Finge asks Pixel Product Manager Stephanie Scott to describe the Pixel 10 phones in…

Claude Exploited for Cybercrime

AI cybercrime and ransomware, GPT-Realtime, AI NPCs, Anthropic deadline, and more...

#amazon bedrock #amazon bedrock agents #amazon bedrock guardrails #generative ai #monitoring and observability #partner solutions #security #ai/ml

Detect Amazon Bedrock misconfigurations with Datadog Cloud Security

We’re excited to announce new security capabilities in Datadog Cloud Security that can help you detect and remediate Amazon Bedrock misconfigurations before they become security incidents. This integration helps organizations embed robust security controls and secure their use of the powerful capabilities of Amazon Bedrock by offering three critical advantages: holistic AI security by integrating AI security into your broader cloud security strategy, real-time risk detection through identifying potential AI-related security issues as they emerge, and simplified compliance to help meet evolving AI regulations with pre-built detections.
