The new operators will help enterprises integrate Vertex AI's generative AI models into data pipelines orchestrated by Apache Airflow and Cloud Composer, Google's managed workflow orchestration service.

Google Cloud has introduced three new Apache Airflow operators for its AI service, Vertex AI. Apache Airflow, which can be thought of as an upgraded version of a cron job scheduler written in Python, helps enterprises connect data systems so that data flows between them. Essentially, Airflow gives developers a way to see and manage how data moves between data systems inside an enterprise.

The new Airflow operators are TextGenerationModelPredictOperator, TextEmbeddingModelGetEmbeddingsOperator, and GenerativeModelGenerateContentOperator, which can be used to generate text predictions, text embeddings, and other content, the company said in a blog post.

These integrations will open up new ways for enterprises to perform data analytics using pipelines and will enable use cases such as automated insights, data enrichment, advanced anomaly detection, generation of content and text embeddings, and translation, Google said.

Automated insights use cases could include generating summaries, reports, and other insights from raw data, Christian Yarros, strategic cloud engineer at Google, explained in the blog post. Data enrichment, according to Yarros, would include enhancing datasets with synthetic data via generative AI models.

The text embedding functionality of the operators can be used to take huge amounts of unstructured text and turn it into a structured format, allowing enterprises to dissect it and derive insights from it, Yarros wrote, adding that the content generation functionality can be used to provide DAG metadata such as descriptions, tags, and document values.
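To make the embedding idea above concrete: an embedding represents a piece of text as a fixed-length list of numbers, and once text is in that form it can be compared numerically. The following is a minimal Python sketch of that comparison step only. It does not call Vertex AI or Airflow; the vectors are hand-written placeholder values rather than real model output, and the helper names `cosine_similarity` and `most_similar` are illustrative assumptions, not part of any Google API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, labeled_vectors):
    """Return the label whose vector is closest to the query vector."""
    return max(
        labeled_vectors,
        key=lambda label: cosine_similarity(query, labeled_vectors[label]),
    )


# Placeholder "embeddings" (assumed values for illustration only).
# A real pipeline would obtain these from an embedding model.
EXAMPLE_VECTORS = {
    "refund request": [0.9, 0.1, 0.0],
    "billing complaint": [0.8, 0.2, 0.1],
    "feature praise": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    new_text_vector = [0.1, 0.1, 0.8]  # placeholder vector for a new document
    print(most_similar(new_text_vector, EXAMPLE_VECTORS))
```

In a real pipeline the vectors would have hundreds of dimensions and come from the embedding operator's output; the comparison logic stays the same.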
Some of the real-world applications of combining Apache Airflow and Vertex AI, according to the company, could be targeted marketing, data cleansing, and coalescing reports.

Enterprises can use Airflow to schedule and orchestrate an email campaign optimization process, Yarros wrote, explaining that once customer data is stored in Google Cloud Storage, developers can use a generative model Airflow operator to analyze that data and create multiple personalized subject lines and content options for each customer segment.

Another way to use the operators would be to represent visual content in new ways. According to Yarros, this can be done by creating an Airflow DAG that triggers when image or video files are uploaded to Google Cloud Storage.

Further, these operators can also be used for cost optimization, the company said, adding that enterprises can use an Airflow DAG to collect cloud resource usage data from monitoring APIs daily or hourly.

“Deploy a Google Generative Model trained on historical usage patterns and reference the model in your Google Generative Model Airflow Operators to analyze the data and identify unusual spikes in CPU usage, network traffic, or storage consumption,” Yarros wrote, adding that if significant anomalies are detected, alerts can be sent to the infrastructure team for investigation and corrective action.
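The cost-optimization use case described above has a simple shape: collect usage samples, flag unusual spikes, alert someone. The Python sketch below illustrates only the "flag unusual spikes" step, and it deliberately swaps in a basic statistical check (a z-score against the sample mean) in place of the generative model the blog post describes. The function name `find_spikes`, the threshold of 2.0, and the CPU readings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_spikes(samples, threshold=2.0):
    """Return indices of samples that sit more than `threshold`
    sample standard deviations above the mean of the series."""
    if len(samples) < 2:
        return []  # not enough data to estimate spread
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []  # a flat series has no spikes
    return [i for i, value in enumerate(samples) if (value - mu) / sigma > threshold]


if __name__ == "__main__":
    # Illustrative hourly CPU utilisation percentages (made-up values).
    hourly_cpu = [41, 43, 40, 42, 44, 41, 97, 42]
    for index in find_spikes(hourly_cpu):
        # A real pipeline would notify the infrastructure team here.
        print(f"hour {index}: CPU at {hourly_cpu[index]}% looks anomalous")
```

A z-score check like this is crude on short series, since a single large value also inflates the standard deviation. It is shown only to make the pipeline step tangible, not as a substitute for the model-based analysis Yarros describes.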