AI

OpenAI establishes a team to study “catastrophic” AI risks, including nuclear threats.

OpenAI has recently established a new team called “Preparedness” to assess and address potentially catastrophic risks associated with AI models. The initiative is led by Aleksander Madry, director of MIT’s Center for Deployable Machine Learning, who joined OpenAI as head of Preparedness. The team’s primary responsibilities are monitoring, forecasting, and safeguarding against risks posed by future AI systems, ranging from their ability to deceive and manipulate humans (as in phishing attacks) to their capacity to generate malicious code.

Image Credit: OpenAI

Preparedness is tasked with studying a range of risk categories, some of which may appear far-fetched, such as “chemical, biological, radiological, and nuclear” threats in the context of AI models. OpenAI CEO Sam Altman, known for expressing concerns about AI-related doomsday scenarios, is taking a proactive approach in preparing for such risks. The company is open to investigating both obvious and less apparent AI risks and is soliciting ideas from the community for risk studies, offering a $25,000 prize and job opportunities with the Preparedness team to top contributors.

In addition to risk assessment, the Preparedness team will work on formulating a “risk-informed development policy” to guide OpenAI’s approach to AI model evaluations, monitoring, risk mitigation, and governance structure. This approach complements OpenAI’s existing work in AI safety, focusing on both the pre- and post-model deployment phases. OpenAI acknowledges the potential benefits of highly capable AI systems but emphasizes the need to understand and establish infrastructure to ensure their safe use and operation. This announcement coincides with a major U.K. government summit on AI safety and follows OpenAI’s commitment to study and control emerging forms of “superintelligent” AI, driven by concerns about the potential for advanced AI systems to surpass human intelligence within the next decade.
You can read more details on the OpenAI blog here.

Yuuma



Microsoft Azure AI Unveils Idea2Img: Transforming Image Development with Innovative Multimodal AI Framework

Microsoft Azure AI has unveiled a groundbreaking innovation in the realm of image development. They’ve introduced Idea2Img, a multimodal AI framework designed to simplify the process of transforming abstract concepts into tangible images, reducing the need for manual effort.

Idea2Img leverages large multimodal models (LMMs) such as GPT-4V to drive a self-refinement process. In this iterative loop, GPT-4V generates candidate prompts, selects the best draft image, and reflects on feedback to continually improve the results.

Image Credit: Microsoft

What sets Idea2Img apart is its integrated memory module, which tracks the exploration history for each type of prompt, whether image, text, or feedback. This continual interplay between generation, selection, and reflection, all driven by GPT-4V, is what underpins Idea2Img’s capabilities.
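The generate-select-reflect cycle is easiest to see as a loop. The sketch below is a minimal illustration under assumed interfaces: the `lmm` and `t2i_model` objects and every method on them are hypothetical stand-ins for calls to GPT-4V and a text-to-image model, not Microsoft’s actual implementation.

```python
# Minimal sketch of an Idea2Img-style self-refinement loop.
# All names (lmm, t2i_model, and their methods) are hypothetical stand-ins.

def idea2img(idea: str, lmm, t2i_model, max_rounds: int = 3):
    """Iteratively refine a text-to-image prompt for a high-level idea."""
    memory = {"prompts": [], "images": [], "feedback": []}  # exploration history
    prompt = lmm.generate_prompt(idea, memory)              # initial draft prompt

    for _ in range(max_rounds):
        # 1. Draft several candidate images from the current prompt.
        drafts = [t2i_model.generate(prompt) for _ in range(3)]

        # 2. The LMM picks the draft that best matches the idea.
        best = lmm.select_best(idea, drafts)

        # 3. The LMM critiques the chosen draft against the idea.
        feedback = lmm.reflect(idea, best)

        # 4. Record the round so later iterations avoid repeated mistakes.
        memory["prompts"].append(prompt)
        memory["images"].append(best)
        memory["feedback"].append(feedback)

        if feedback.satisfied:  # stop early once the critique passes
            return best

        # 5. Revise the prompt using the accumulated history.
        prompt = lmm.generate_prompt(idea, memory)

    return memory["images"][-1]
```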

In practical scenarios involving intertwined picture-text sequences, visual design elements, and complex usage descriptions, Idea2Img excels. It can even extract intricate visual information from input images. To assess its effectiveness, the research team conducted user preference studies comparing Idea2Img with various other models. The results were striking: a 26.9% gain in user preference when Idea2Img was paired with SDXL, underscoring its efficacy in the field.

In conclusion, Microsoft’s Idea2Img is a significant advancement in image development and design. By harnessing the potential of LMMs and iterative self-refinement, it promises to revolutionize the way we create visual assets from abstract ideas. Its adaptability in complex multimodal scenarios and substantial improvements in user preferences make it a game-changing innovation with far-reaching implications for businesses and industries reliant on image creation and design. It has the potential to enhance efficiency and output quality, ultimately leading to greater competitiveness and customer satisfaction.

Asahi



Google takes aim at Duolingo with new English tutoring tool

Google is making a significant move in the language learning space with a new feature in Google Search that aims to enhance users’ English speaking skills. Initially, this feature is rolling out to Android users in Argentina, Colombia, India, Indonesia, Mexico, and Venezuela, with plans to expand to more countries and languages in the future. This new tool offers interactive speaking practice and personalized feedback for learners translating to or from English, making it a valuable addition to Google’s language learning resources.

The personalized nature of this feature is a standout aspect. Google’s approach includes providing semantic feedback to assess the relevance and comprehensibility of a learner’s response to a given question. Additionally, it identifies areas where grammar improvements can be made and offers example answers at different language complexity levels. During practice sessions, users can also access contextual translations for any words they don’t understand, creating a holistic learning experience.
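Google has not published how this feedback is implemented, so purely as an illustration of the idea, semantic feedback of this kind can be approximated with off-the-shelf sentence embeddings: score the learner’s answer by its similarity to reference answers for the question. The model name and thresholds below are arbitrary choices for this sketch, not anything Google has described.

```python
# Toy illustration of semantic feedback: does a learner's answer address
# the question? (Not Google's system; uses open-source sentence embeddings.)
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

def semantic_feedback(learner_answer: str, reference_answers: list[str]) -> str:
    """Compare the learner's response to known-good answers."""
    ans_emb = model.encode(learner_answer, convert_to_tensor=True)
    ref_embs = model.encode(reference_answers, convert_to_tensor=True)
    best_score = util.cos_sim(ans_emb, ref_embs).max().item()

    if best_score > 0.7:   # arbitrary thresholds for this sketch
        return "Your answer is relevant and on topic."
    elif best_score > 0.4:
        return "Your answer is partly related; try to address the question more directly."
    return "Your answer does not seem to answer the question. Try again."

print(semantic_feedback(
    "I usually take the bus to work.",
    ["I go to work by bus.", "I drive to my office."],
))
```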

Image Credit: Google

To develop this feature, Google invested heavily in AI and machine learning. The Google Translate team created the Deep Aligner model for suggesting translations, and other research groups adapted grammar correction models for speech transcriptions, especially for users with accented speech. Google Research teams designed models for semantic feedback and sentence complexity estimation. To ensure a well-rounded language learning experience, Google collaborated with linguists, teachers, and ESL/EFL pedagogical experts, who contributed a mix of human-expert content, AI-assisted content, and in-house human-reviewed material.

While Google’s precise intentions with this feature remain unclear, it has the potential to boost user engagement. Although the blog post doesn’t explicitly indicate that Google is targeting established language learning apps like Duolingo, it is an intriguing move in a field with substantial profit potential. Google has previously ventured into language learning and education tools, and the success and direction of these efforts may depend on user adoption and popularity.

You can also check out Google’s blog post here.

Yuuma



Google is bringing generative AI editing to Google Photos

Google is introducing generative AI capabilities to its popular Google Photos app with the release of the Pixel 8 and Pixel 8 Pro smartphones. Initially revealed at Google’s I/O developer conference in May, the feature, called Magic Editor, enables more advanced photo edits, such as filling in gaps, repositioning subjects, and adjusting the foreground or background of images. Previously, achieving these effects required external tools like Google’s Magic Eraser or professional software like Photoshop, involving more manual effort.

Image Credit: Google

Using generative AI, Google Photos allows users to perform complex edits like resizing or repositioning subjects with ease. Users can tap on the object they wish to edit, drag it to move or pinch to resize, and make contextual adjustments to lighting and background. Magic Editor also offers multiple output options for user preference. However, Google acknowledges that the feature is in its early stages and may not always produce the desired results but hopes to improve it with user feedback and technological advancements.
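Google hasn’t disclosed how Magic Editor works internally. As a rough analogy only, the “fill in gaps” class of edit resembles diffusion-based inpainting, which exists in open-source form; the sketch below uses Hugging Face’s diffusers library with a public inpainting checkpoint, an assumption for illustration rather than Google’s pipeline.

```python
# Conceptual analogy for "generative fill": diffusion inpainting via the
# open-source diffusers library (not Google's Magic Editor pipeline).
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("photo.png").convert("RGB")  # original photo
mask = Image.open("mask.png").convert("RGB")    # white = region to regenerate

# The model repaints the masked region so it blends with its surroundings,
# e.g. filling the gap left behind after moving or removing a subject.
result = pipe(
    prompt="empty sandy beach, natural lighting",
    image=image,
    mask_image=mask,
).images[0]
result.save("edited.png")
```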

The real-world testing of Magic Editor is imminent, and Google Photos users, especially those on the newer Pixel devices, will have the opportunity to try it out. With 1.7 billion photo edits made by Google Photos users monthly, the potential for learning and improvement is significant. This feature is part of a broader set of AI-powered photo-editing tools for the Pixel 8 and 8 Pro, including Best Take, Zoom Enhance, and enhancements to Magic Eraser, which will be available on Pixel 8 devices starting October 12.

Yuuma



DALL-E 3 comes to Bing Chat

DALL-E 3 appears to have been added to Bing Chat, which is available in the Edge browser and elsewhere.

Image generation with DALL-E 2 was already possible. With the switch to DALL-E 3, prompts can be specified in finer detail, and the output is reportedly more photorealistic.
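Bing Chat exposes the model through plain chat input, but for reference, DALL-E 3 can also be called directly through OpenAI’s official Python client; the prompt and size below are arbitrary examples for this sketch.

```python
# Minimal example of calling DALL-E 3 directly via OpenAI's API
# (Bing Chat wraps the same model behind its chat interface).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-3",
    prompt="A photorealistic cat sitting by a window at dusk",
    size="1024x1024",
    n=1,  # DALL-E 3 generates one image per request
)
print(response.data[0].url)  # URL of the generated image
```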

Typing something like “Please draw a picture of …” into Bing Chat’s input field produced an illustration.

Judging only from the example images, the broken fingers that image-generation AI often produces appear to be less common.

Wednesday contributor: Tanaka



