Latest AI News & Updates

#large language models #artificial intelligence #data science #llm applications #programming #python

What took GPT-4o 2 hours to solve, Sonnet 4.5 does in 5 seconds 
The post This Puzzle Shows Just How Far LLMs Have Progressed in a Little Over a Year appeared first on Towards Data Science.

The good news is that recognizing these red flags early can cut your attack costs in half. Prevention beats recovery every time.

One of the claims made by OpenAI regarding its latest model, GPT-5, is a breakthrough in reasoning for math and logic, with the ability to “think” more deeply when a prompt benefits from careful analysis.

#ai

For many enterprises, there continue to be barriers to fully adopting and benefiting from agentic AI. IBM is betting the blocker isn't building AI agents but governing them in production. At its TechXchange 2025 conference today, IBM unveiled a series of capabilities designed to bridge the gap: Project Bob, an AI-first IDE that orchestrates multiple LLMs to automate application modernization; AgentOps for real-time agent governance; and the first integration of open-source Langflow into watsonx Orchestrate, IBM's platform for deploying and managing AI agents.

IBM's announcements represent a three-pronged strategy to address interconnected enterprise AI challenges: modernizing legacy code, governing AI agents in production and bridging the prototype-to-production gap. The company claims 6,000 internal developers within IBM have used Project Bob, achieving an average productivity gain of 45% and a 22-43% increase in code commits.

Project Bob isn't another vibe coder, it's an enterprise modernization tool

There is no shortage of AI-powered coding tools in the market today, including tools like GitHub Copilot and vibe coding tools such as Replit, Cursor, Bolt and Lovable. "Project Bob takes a fundamentally different approach from tools like GitHub Copilot or Cursor," Bruno Aziza, IBM's vice president of data, AI and analytics strategy, told VentureBeat.

Aziza said that Project Bob is enterprise-focused and maintains full-repository context across editing sessions. It automates complex tasks like upgrades from Java 8 to more modern Java versions and framework migrations from Struts or JSF to React, Angular or Liberty. The architecture orchestrates between Anthropic's Claude, Mistral, Meta's Llama and IBM's recently released Granite 4 models through a data-driven model selection approach. The system routes tasks to whichever LLM is best suited, balancing accuracy, latency and cost in real time.

"It understands the entire repository, development intent and security standards, enabling developers to design, debug, refactor and modernize code without breaking flow," he said. Among 6,000 early adopters within IBM, 95% used Bob for task completion rather than code generation. The tool integrates DevSecOps practices like vulnerability detection and compliance checks directly into the IDE. "Bob goes beyond code assistance — it orchestrates intelligence across the entire software development lifecycle, helping teams ship secure, modern software faster," he said.

Project Bob benefits from new Anthropic partnership

Part of Project Bob is a new partnership between IBM and Anthropic. The two vendors announced a deal to integrate Claude models directly into the watsonx portfolio, starting with Project Bob. The collaboration extends beyond model integration to include what IBM describes as a first-of-its-kind guide for enterprise AI agent deployment. IBM and Anthropic co-created "A Guide to Architecting Secure Enterprise AI Agents with MCP Servers," focused on the Agent Development Lifecycle (ADLC), a structured approach to designing, deploying and managing enterprise AI systems. MCP refers to Model Context Protocol, Anthropic's widely embraced open standard for connecting AI assistants to the systems and data they need to work with.

Making it easier to build enterprise-grade AI agents

In addition to Project Bob, IBM announced that it is extending its watsonx Orchestrate technology to integrate Langflow, an open-source visual agent builder led by DataStax, which IBM acquired in May of this year.
The integration of Langflow is intended to address what Aziza calls the "prototype to production chasm."

"Today, there's no seamless path from open-source prototyping to enterprise-grade systems that are reliable, compliant and scalable," Aziza said. "Watsonx Orchestrate transforms Langflow-like agentic composition into an enterprise-grade orchestration platform by adding governance, security, scalability, compliance, and operational robustness — making it production-ready for mission-critical use."

Aziza explained that the integration of Langflow with watsonx Orchestrate brings critical capabilities on top of the open-source tool, including:

Agent lifecycle framework: Provisioning, versioning, deployment and monitoring with multi-agent coordination and role-based access.
Integrated AI governance: Embedded watsonx.governance provides audit trails, explainability for agent decisions, bias and drift monitoring and policy enforcement. Langflow has no native governance controls.
Enterprise infrastructure: SaaS or on-premises hosting with data isolation, SSO/LDAP integration and fine-grained permissions. Langflow users must manage their own infrastructure and security.
No-code and pro-code options: Langflow is "low-code." IBM added a visual, no-code Agent Builder and a pro-code Agent Development Kit for seamless promotion from prototype to production.
Pre-built domain agents: A catalog of HR, IT and finance agents integrated with Workday, SAP and ServiceNow.
Production observability: Built-in dashboards, analytics and enterprise support SLAs with continuous performance monitoring.

AgentOps and Agentic Workflows: from building to governing

IBM is also introducing two new capabilities to watsonx Orchestrate that work in tandem with the Langflow integration: Agentic Workflows for standardized agent coordination and AgentOps for production governance. Agentic Workflows addresses what Aziza calls the "brittle scripts" problem. Today, developers build agents using custom scripts that break when scaled across enterprise environments. Agentic Workflows provides standardized, reusable flows that sequence multiple agents and tools consistently.

This connects directly to the Langflow integration. While Langflow provides the visual interface for building individual agents, Agentic Workflows handles the orchestration layer, coordinating multiple agents and tools into repeatable enterprise processes. AgentOps then provides the governance and observability for those running workflows. The new built-in observability layer provides real-time monitoring and policy-based controls across the full agent lifecycle.

The governance gap becomes concrete in enterprise scenarios. Without AgentOps, an HR onboarding agent might set up benefits and payroll, but teams lack visibility into whether it's applying policies correctly until problems surface. With AgentOps, every action is monitored in real time, allowing anomalies to be flagged and corrected immediately.

What this means for enterprises

Technical debt is something that many organizations struggle with, and it can represent a non-trivial barrier for organizations looking to get into agentic AI deployments.

Project Bob's value proposition is clearest for organizations with significant legacy Java codebases. The 45% productivity gains IBM measured internally suggest meaningful acceleration for upgrades from Java 8 to more modern Java versions and migrations from frameworks like Struts or JSF to modern architectures. However, these metrics come from IBM developers working on IBM systems. The critical unknown is whether the multi-model orchestration delivers the same results on customer codebases with different architectural patterns, technical debt profiles and team skill levels.

The Langflow integration addresses a genuine gap for teams already using open-source agent frameworks. The challenge isn't building agents with tools like LangChain, LangGraph or n8n. It's adding the governance layer, lifecycle management, enterprise security controls and observability required for production deployment.

For enterprises looking to lead in AI adoption, IBM's announcements reinforce the fact that governance infrastructure is now table stakes. You can build agents quickly with existing tools; scaling them safely requires lifecycle management, observability and policy controls.

Project Bob is now available in private tech preview, with broader availability expected in the future. IBM is accepting access requests through its developer portal. The AgentOps and Agentic Workflows capabilities are now available in watsonx Orchestrate, while the Langflow integration is expected to be generally available at the end of this month.
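IBM hasn't published how Project Bob's orchestration layer actually chooses among Claude, Mistral, Llama and Granite, but the general idea of accuracy-, latency- and cost-aware model routing can be sketched in a few lines. A minimal sketch, assuming a static table of per-model estimates (the model names, scores and prices below are purely illustrative, not IBM's):

```python
# Illustrative multi-model router: pick the cheapest model whose estimated
# quality and latency clear the task's bar. All numbers are made up.
MODELS = {
    "granite-4": {"quality": 0.70, "cost": 1.0, "latency_ms": 300},
    "llama-3":   {"quality": 0.80, "cost": 2.0, "latency_ms": 500},
    "claude":    {"quality": 0.95, "cost": 8.0, "latency_ms": 900},
}

def route(min_quality: float, max_latency_ms: int) -> str:
    """Return the cheapest model meeting the quality and latency bars."""
    candidates = [
        name for name, m in MODELS.items()
        if m["quality"] >= min_quality and m["latency_ms"] <= max_latency_ms
    ]
    if not candidates:
        raise ValueError("no model satisfies the constraints")
    return min(candidates, key=lambda name: MODELS[name]["cost"])

print(route(0.75, 600))   # llama-3: cheapest model clearing both bars
print(route(0.90, 1000))  # claude: the only model with quality >= 0.9
```

A production router would replace the static table with live telemetry and per-task quality predictions, but the selection step reduces to the same constrained minimization.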

#culture / digital culture #the big story #culture

Behold Neural Viz, the first great cinematic universe of the AI era. It's from a guy named Josh.

#artificial intelligence #app #the algorithm

Last week OpenAI released Sora, a TikTok-style app that presents an endless feed of exclusively AI-generated videos, each up to 10 seconds long. The app allows you to create a “cameo” of yourself—a hyperrealistic avatar that mimics your appearance and voice—and insert other people’s cameos into your own videos (depending on what permissions they set). …

#large language models #ai agent #context engineering #deep dives #llm #machine learning

Learn how to optimize the context of your agents for powerful agentic performance
The post How to Perform Effective Agentic Context Engineering appeared first on Towards Data Science.

ChatGPT Apps, Rufus finds fakes, leopards hunted us, AMD + OpenAI, and more...

#business #business / artificial intelligence #gear #gear / gear news and events

“I don’t think we have an easy relationship with our technology at the moment,” the former Apple designer said at OpenAI’s developer conference in San Francisco on Monday.

#research #fusion #aeronautical and astronautical engineering #artificial intelligence #computer modeling #energy #machine learning #renewable energy #sustainability #laboratory for information and decision systems (lids) #nuclear power and reactors #school of engineering #mit schwarzman college of computing #fluid dynamics #plasma science and fusion center

The approach combines physics and machine learning to avoid damaging disruptions when powering down tokamak fusion machines.

#3-d printing #artificial intelligence #machine learning #materials science and engineering #mechanical engineering #research #school of engineering #mit.nano #dmse

Incorporating machine learning, MIT engineers developed a way to 3D print alloys that are much stronger than conventionally manufactured versions.

#ai

OpenAI launched an agent builder that the company hopes will eliminate fragmented tools and make it easier for enterprises to use OpenAI's system to create agents. AgentKit, announced during OpenAI's DevDay in San Francisco, enables developers and enterprises to build agents and add chat capabilities in one place, potentially competing with platforms like Zapier. By offering a more streamlined way to create agents, OpenAI advances further into becoming a full-stack application provider.

"Until now, building agents meant juggling fragmented tools—complex orchestration with no versioning, custom connectors, manual eval pipelines, prompt tuning, and weeks of frontend work before launch," the company said in a blog post.

AgentKit includes:

Agent Builder: a visual canvas where developers can see what they've created and version multi-agent workflows.
Connector Registry: a central area for admins to manage connections across OpenAI products. A Global Admin console will be a prerequisite for using this feature.
ChatKit: enables users to integrate chat-based agents into their user interfaces.

Eventually, OpenAI said, it will build a standalone Workflows API and add agent deployment tabs to ChatGPT. OpenAI also expanded evaluation for agents, adding capabilities such as datasets with automated graders and annotations, trace grading that runs end-to-end assessments of workflows, automated prompt optimization, and support for third-party agent measurement tools.

Developers can access some features of AgentKit now, but OpenAI is gradually rolling out others, such as Agent Builder. Currently, Agent Builder is available in beta, while ChatKit and the new evaluation capabilities are generally available. Connector Registry is beginning its beta rollout to some API and ChatGPT Enterprise and Edu users. OpenAI said pricing for AgentKit tools will be included in standard API model pricing.
Agent Builder

To clarify, many agents are built using OpenAI's models; however, enterprises often access GPT-5 through other platforms to create their own agents. AgentKit brings enterprises more fully into OpenAI's ecosystem, ensuring they don't need to tap other platforms as often. In a demonstration during DevDay, the company showed that Agent Builder is built for rapid iteration and gives developers visibility into how their agents are working. During the demo, an OpenAI developer made an agent that reads the DevDay agenda and suggests panels to watch. It took her just under eight minutes.

Other model providers have also seen the importance of offering developer toolkits for building agents to entice enterprises to use more of their tools. Google came out with its Agent Development Kit in April, enabling multi-agent system building "in under 100 lines of code." Microsoft, which runs the popular agent framework AutoGen, announced it is bringing agent creation into one place with its new Agent Framework.

OpenAI customers report strong results. The fintech company Ramp stated in a blog post that its teams were able to build a procurement agent in a few hours instead of months. "Agent Builder transformed what once took months of complex orchestration, custom code, and manual optimizations into just a couple of hours. The visual canvas keeps product, legal, and engineering on the same page, slashing iteration cycles by 70% and getting an agent live in two sprints rather than two quarters," Ramp said.

AgentKit's Connector Registry will also enable enterprises to manage and maintain data across workspaces, consolidating data sources into a single panel that spans both ChatGPT and the API. It will have pre-built connectors to Dropbox, Google Drive, SharePoint and Microsoft Teams, and it also supports third-party MCP servers.
Another capability of Agent Builder is Guardrails, an open-source safety layer that protects against the leakage of personally identifiable information (PII), jailbreaks, and unintended or malicious behavior.

Bringing more chat

Since most agentic interactions involve chat, it makes sense to simplify the process for developers to set up chat interfaces and connect them with the agents they've just built. "Deploying chat UIs for agents can be surprisingly complex—handling streaming responses, managing threads, showing the model thinking and designing engaging in-chat experiences," OpenAI said. The company said ChatKit makes it simple to embed chat agents into apps or websites.

However, some OpenAI competitors have begun thinking beyond the chatbot and want to offer agentic interactions that feel more seamless. Google's asynchronous coding agent, Jules, has introduced a feature that enables users to interact with the agent through the command-line interface, eliminating the need to open a chat window.

Responses

The response to AgentKit has mainly been positive, with some developers noting that while it simplifies agent building, it doesn't mean that everyone can now build agents. Several developers view AgentKit not as a Zapier killer, but rather as a tool that complements the pipeline. Zapier debuted its own no-code tool for building AI agents and bots, called Zapier Central, in 2024.
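The PII-screening idea behind a guardrail layer like the one described above can be illustrated with a stdlib-only sketch. This is the general pattern, not OpenAI's actual Guardrails implementation; real detectors use far more robust techniques (NER models, checksums, locale rules) than these toy regexes:

```python
import re

# Toy PII screen in the spirit of an agent guardrail: redact obvious
# emails and US SSNs before text leaves the agent, and report what hit.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> tuple[str, list[str]]:
    """Return redacted text plus the kinds of PII that were found."""
    found = []
    for kind, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(kind)
            text = pattern.sub(f"[{kind.upper()} REDACTED]", text)
    return text, found

clean, hits = redact_pii("Reach me at jane@example.com, SSN 123-45-6789.")
print(hits)   # ['email', 'ssn']
print(clean)  # both values replaced with redaction markers
```

A guardrail framework wraps checks like this around every agent input and output, choosing per policy whether a hit blocks the message, redacts it, or merely logs it.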

#business #business / artificial intelligence

At OpenAI’s Developer Day, CEO Sam Altman showed off apps that run entirely inside the chat window—a new effort to turn ChatGPT into a platform.

#advanced (300) #amazon sagemaker #customer solutions #uncategorized

In this post, we demonstrate how PowerSchool built and deployed a custom content filtering solution using Amazon SageMaker AI that achieved better accuracy while maintaining low false positive rates. We walk through our technical approach to fine-tuning Llama 3.1 8B, our deployment architecture, and the performance results from internal validations.

#business #business / artificial intelligence

OpenAI’s latest move in the race to build massive data centers in the US shows it believes demand for AI will keep surging—even as skeptics warn of a bubble.

#ai #programming & development

OpenAI's annual conference for third-party developers, DevDay, kicked off with a bang today as co-founder and CEO Sam Altman announced a new "Apps SDK," or software development kit, that makes it "possible to build apps inside of ChatGPT," including paid apps that companies can charge for using OpenAI's recently unveiled Agentic Commerce Protocol (ACP). In other words, instead of launching apps one by one on your phone, computer, or the web, you can now do all of that without ever leaving ChatGPT.

This feature allows the user to log into their accounts on those external apps and bring their information back into ChatGPT, using the apps much as they already do outside the chatbot, but now with the ability to ask ChatGPT to perform certain actions, analyze content, or go beyond what each app could offer on its own. You can direct Canva to make you slides based on a text description, ask Zillow for home listings in a certain area fitting certain requirements, or ask Coursera about a specific lesson's content while it plays on video, all from within ChatGPT — with many other apps also already offering their own connections (see below).

"This will enable a new generation of apps that are interactive, adaptive and personalized, that you can chat with," Altman said.

While the Apps SDK is available today in preview, OpenAI said it would not begin accepting new apps within ChatGPT or allow them to charge users until "later this year." In-line app access is already rolling out to ChatGPT Free, Plus, Go and Pro users — outside of the European Union only for now — with Business, Enterprise, and Education tiers expected to receive access later this year.

Built atop the common MCP standard

Built on the open-source Model Context Protocol (MCP) standard introduced by rival Anthropic nearly a year ago, the Apps SDK gives third-party developers working independently or on behalf of enterprises large and small the ability to connect selected data,
"trigger actions, and render a fully interactive UI [user interface]," Altman explained during his introductory keynote speech. The Apps SDK includes a "talking to apps" feature that allows ChatGPT and the underlying GPT-5 or other "o-series" models piloting it to obtain updated context from the third-party app or service, so the model "always knows exactly what your user is interacting with," according to another presenter, OpenAI engineer Alexi Christakis.

Developers can build apps that:

appear inline in chat as lightweight cards or carousels
expand to fullscreen for immersive tasks like maps, menus, or slides
use picture-in-picture for live sessions such as video, games, or quizzes

Each mode is designed to preserve ChatGPT's minimal, conversational flow while adding interactivity and brand presence.

Early integrations with Coursera, Canva, Zillow and more

Christakis showed off early integrations built atop the Apps SDK, including ones from e-learning company Coursera, cloud design software company Canva, and real estate search engine Zillow. Altman also announced Apps SDK integrations with additional partners not demoed during the keynote: Booking.com, Expedia, Figma and Spotify. In documentation, OpenAI said more partners are on deck: AllTrails, Peloton, OpenTable, Target, theFork, and Uber, representing lifestyle, commerce, and productivity categories.

The Coursera demo included an example of how the user onboards to an external app, including a login screen for Coursera that appears within the ChatGPT chat interface, activated simply by a text prompt from the user: "Coursera, can you teach me something about machine learning?" Once logged in, the app launched within the chat interface, in line, and can render anything from the web, including interactive elements like video.
Christakis also showed that the Apps SDK supports picture-in-picture and fullscreen views, letting the user choose how to interact with an app. When he played a Coursera video, it automatically pinned to the top of the screen so the user could keep watching even while continuing a back-and-forth text dialog with ChatGPT in the usual prompt-and-response area below. Users can then ask ChatGPT about content appearing in the video without specifying exactly what was said, as the Apps SDK pipes the information server-side from the connected app to the underlying model. So "can you explain more about what they're saying right now" will automatically surface the relevant portion of the video and provide it to the model to analyze and respond to in text.

In another example, Christakis opened an older ChatGPT conversation about his siblings' dog walking business and resumed it by asking another third-party app, Canva, to generate a poster using one of ChatGPT's recommended business names, "Walk This Wag," along with specific guidance about font choice ("sans serif") and overall coloration and style ("bright and colorful"). Instead of the user manually adding all those elements to a Canva template, ChatGPT issued the commands and performed the actions on the user's behalf in the background. After a few minutes, ChatGPT responded with several poster designs generated directly within the Canva app but displayed in the user's ChatGPT session, where they could be reviewed, enlarged, and given feedback or adjustment requests.

Christakis then asked ChatGPT to turn one of the posters into an entire slide deck so the founders of the dog walking business could present it to investors, which it did in the background over several minutes while he presented a
final integrated app, Zillow. He started a new chat session and asked a simple question: "Based on our conversations, what would be a good city to expand the dog walking business?" Using ChatGPT's optional memory feature, it referenced the dog walking conversation and suggested Pittsburgh, which Christakis used as a chance to type in "Zillow" and "show me some homes for sale there." That called up an interactive map from Zillow with homes for sale, prices listed, and hover-over animations, all in line within ChatGPT.

Clicking a specific home opened a fullscreen view with "most of the Zillow experience," entirely without leaving ChatGPT, including the ability to request home tours, contact agents, and filter by bedrooms and other qualities like outdoor space. ChatGPT pulls up the requested filtered Zillow search and provides a text-based response in line explaining what it did and why. The user can then ask follow-up questions about the specific property, such as "how close is it to a dog park?", or compare it to other properties, all within ChatGPT. ChatGPT can also use apps in conjunction with its Search function, searching the web to compare the app's information (in this case, Zillow's) with other sources.

Safety, privacy, and developer standards

OpenAI emphasized that apps must comply with strict privacy, safety, and content standards to be listed in the ChatGPT directory. Apps must:

serve a clear and valuable purpose
be predictable and reliable in behavior
be safe for general audiences, including teens aged 13–17
respect user privacy and limit data collection to only what's necessary

Every app must also include a clear, published privacy policy, obtain user consent before connecting, and identify any actions that modify external data (e.g., posting, sending, uploading). Apps violating OpenAI's usage policies, crashing frequently, or misrepresenting their capabilities may be removed at any time.
Developers must submit from verified accounts, provide customer support contacts, and maintain their apps for stability and compliance. OpenAI also published developer design guidelines outlining how apps should look, sound, and behave. They must follow ChatGPT's visual system, including consistent color palettes, typography, spacing, and iconography, and maintain accessibility standards such as alt text and readable contrast ratios. Partners can show brand logos and accent colors but cannot alter ChatGPT's core interface or use promotional language. Apps should remain "conversational, intelligent, simple, responsive, and accessible," according to the documentation.

A new conversational app ecosystem

By opening ChatGPT to third-party apps and payments, OpenAI is taking a major step toward transforming ChatGPT from a chatbot into a full-fledged AI operating system, one that combines conversational intelligence, rich interfaces, and embedded commerce. For developers, that means direct access to over 800 million ChatGPT users, who can discover apps "at the right time" through natural conversation, whether planning trips, learning, or shopping. For users, it means a new generation of apps you can chat with, where a single interface helps you book a flight, design a slide deck, or learn a new skill without ever leaving ChatGPT. As OpenAI put it: "This is just the start of apps in ChatGPT, bringing new utility to users and new opportunities for developers."

There remain a few big questions, namely:

1. What happens to all the data from those third-party apps as they interface with ChatGPT and its users? Does OpenAI get access to it, and can it train on it?
2. What happens to OpenAI's once much-hyped GPT Store, which had been promoted as a way for third-party creators and developers to build custom, task-specific versions of ChatGPT and make money on them through a usage-based revenue share model?

We've asked the company about both issues and will update when we hear back.
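Under the hood, MCP is a thin JSON-RPC 2.0 protocol, so a ChatGPT-to-app tool invocation like the Zillow search described above reduces to a small JSON envelope. A sketch of the request/response shape (the envelope follows the MCP spec's `tools/call` method; the tool name and arguments are hypothetical, not Zillow's actual integration):

```python
import json

# Illustrative MCP "tools/call" exchange. A client (here, ChatGPT) sends a
# JSON-RPC 2.0 request to a third-party server; the server replies with a
# result whose "content" the model can read and reason over.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_listings",  # hypothetical tool exposed by the app
        "arguments": {"city": "Pittsburgh", "max_price": 400000},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,  # echoes the request id
    "result": {
        "content": [{"type": "text", "text": "Found homes under $400k."}]
    },
}

# Both sides serialize the envelopes as JSON on the wire.
wire = json.dumps(request)
print(json.loads(wire)["method"])  # tools/call
```

The Apps SDK layers UI rendering and ChatGPT-specific metadata on top, but any client speaking this envelope can call the same server, which is why one MCP implementation can serve multiple assistant platforms.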

#data science #career advice #editors pick #generative ai tools #interview #productivity

Practical AI hacks for every stage of the job search — with real prompts and examples
The post How I Used ChatGPT to Land My Next Data Science Role appeared first on Towards Data Science.

#artificial intelligence #ai safety #deep dives #llm #llm applications #model validation

Exploring the most practical guardrails to implement at ground level
The post How To Build Effective Technical Guardrails for AI Applications appeared first on Towards Data Science.

#business #business / artificial intelligence

On this episode of Uncanny Valley, we break down some of the week's best stories, covering everything from Peter Thiel's obsession with the Antichrist to the launch of OpenAI’s new Sora 2 video app.

See why every Python developer should give TypeScript a serious look, and find out how to get productive fast.

#pixel #ai

These Pixel 10 and Google Photos features make it easier than ever to take great group photos.

#gemini models #google labs #ai #developers #ask a techspert

Learn more about AI and how it’s enabling new development tools, like “vibe coding.”

A hands-on guide to tracking experiments, versioning models, and keeping your ML projects reproducible with Weights & Biases.

#google deepmind #safety & security #ai

An overview of Google’s cohesive strategy for securing the AI ecosystem from the inside out.

CodeMender helps patch critical software vulnerabilities, and rewrites and secures existing code.

#data visualization #data science #plotly #dash #dash plotly #dashboard design

An easy starting point for larger and more complicated Dash dashboards
The post Plotly Dash — A Structured Framework for a Multi-Page Dashboard appeared first on Towards Data Science.

No recruiters contacted you recently? Here are 7 LinkedIn tricks to make you stand out.

#ai & ml #research

Just a few years ago, AI coding assistants were little more than autocomplete curiosities—tools that could finish your variable names or suggest a line of boilerplate. Today, they’ve become an everyday part of millions of developers’ workflows, with entire products and startups built around them. Depending on who you ask, they represent either the dawn […]

Time series data have the added complexity of temporal dependencies, seasonality, and possible non-stationarity.

#pixel #ai

In August, Alex Cooper appeared at our Made by Google event to help introduce Camera Coach, a new feature on Pixel 10. Today, we’re announcing another collaboration with…
