
  • China's Qinglong full-sized general-purpose humanoid robot was just released at the 2024 World Artificial Intelligence Conference, showcasing dexterous task completion by voice instruction.

  • But there's a twist.

  • This is because Qinglong is an open-source humanoid, part of China's attempt to create a worldwide development ecosystem from within the existing robotics and AI communities.

  • To this end, the national-local Joint Humanoid Robot Innovation Center showcased its progress on the robot's capabilities and design concepts in a live demonstration.

  • And as for its specifications, Qinglong stands at 185 centimeters tall and weighs 80 kilograms.

  • It features a highly bionic torso and anthropomorphic motion control, enabling multimodal mobility, perception, interaction, and manipulation.

  • With 43 active degrees of freedom, Qinglong delivers a peak joint torque of 200 newton-meters per kilogram and onboard computational power of 400 tera operations per second.

  • Its battery has a capacity of 2,052 watt-hours, powering the robot and its 360-degree LiDAR system; these figures are collected in the sketch below.
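
For reference, here is a minimal Python sketch gathering the figures quoted above into a single data structure; the class and field names are illustrative, not part of any official Qinglong API or spec sheet.

```python
from dataclasses import dataclass

@dataclass
class HumanoidSpec:
    """Illustrative container for the Qinglong figures quoted above."""
    height_cm: int               # 185 cm tall
    weight_kg: int               # 80 kg
    active_dof: int              # 43 active degrees of freedom
    peak_torque_nm_per_kg: int   # 200 N*m/kg peak joint torque
    compute_tops: int            # 400 tera operations per second
    battery_wh: int              # 2,052 Wh battery capacity

qinglong = HumanoidSpec(
    height_cm=185,
    weight_kg=80,
    active_dof=43,
    peak_torque_nm_per_kg=200,
    compute_tops=400,
    battery_wh=2052,
)
print(qinglong)
```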

  • In terms of performance, the robot's design integrates lightweight legs for agile walking and high-precision arms for dexterous tasks.

  • As a result, Qinglong can walk rapidly around obstacles, move stably on inclines and declines, and even resist impact interference.

  • These features are intended to make it an open platform for developing general artificial intelligence software and hardware for robots.

  • Next, in terms of development, Qinglong is a creation of Humanoid Robots Shanghai Limited, an R&D institution backed by leading industry enterprises with approximately 140 million USD in funding.

  • In the future, the team aims to introduce a new humanoid robot model annually, each named after one of the Chinese zodiac animals, with the name Qinglong meaning azure dragon.

  • This is all with the end goal of building an open-source humanoid robot development community in China and beyond.

  • Additionally, the team revealed the launch of the OpenLoong open-source community, a platform providing access to Qinglong's hardware structure and parameters, with the robot's embodied-intelligence software packages also to be open-sourced.

  • By fostering an open-source culture, this initiative hopes to create a collaborative environment for the development of humanoid robotics worldwide.

  • But Qinglong was only one of the humanoids at the World Artificial Intelligence Conference, with Tesla's Optimus Gen 2 also being showcased in public for the first time at this event.

  • The Tesla team highlighted significant advancements in weight reduction, flexibility and functionality, capturing the attention of AI enthusiasts and industry experts alike.

  • Key improvements include its in-house components, with actuators and sensors designed and manufactured entirely by Tesla, ensuring superior integration and performance.

  • Additionally, with a 30% increase in walking speed, Optimus Gen 2 can now operate even more efficiently.

  • Furthermore, the robot is 10 kilograms lighter than its predecessor, resulting in improved balance and battery lifespan; a rough worked example of these two figures follows below.
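
As a rough worked example, the sketch below applies the quoted 30% speed increase and 10 kilogram weight reduction to an assumed Gen 1 baseline; the baseline walking speed and weight are assumptions for illustration only, as the video does not state them.

```python
# Hedged arithmetic sketch: the Gen 1 baselines below are ASSUMPTIONS;
# only the 30% speedup and the 10 kg reduction come from the video.
gen1_walk_speed_m_s = 0.6   # assumed Gen 1 walking speed
gen1_weight_kg = 73.0       # assumed Gen 1 weight

gen2_walk_speed_m_s = gen1_walk_speed_m_s * 1.30  # 30% faster
gen2_weight_kg = gen1_weight_kg - 10.0            # 10 kg lighter

print(f"Gen 2 walking speed: {gen2_walk_speed_m_s:.2f} m/s")
print(f"Gen 2 weight: {gen2_weight_kg:.1f} kg")
```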

  • Plus, its new hands are capable of gripping heavier objects and performing delicate tasks, expanding the robot's range of potential applications.

  • And with all of this progress, Tesla plans to deploy Optimus Gen 2 in its own manufacturing factories to test the robot's practicality first.

  • This initial phase will allow the company to fine-tune the robot's capabilities in a controlled environment, and once proven effective, Tesla will then begin commercial sales of the robot, making it available to a broader market sometime in 2025.

  • But how can these robots quickly learn thousands of dexterous tasks to work with their human-like hands, you ask?

  • To this end, Meta just introduced a new dataset named HOT3D to advance artificial intelligence research in the realm of 3D hand-object interactions.

  • This extensive dataset comprises over one million frames captured from multiple perspectives, offering a rich resource for researchers delving into the complexities of how humans manipulate objects with their hands.

  • According to Meta, understanding these interactions remains a crucial challenge in computer vision research.

  • And the HOT3D dataset is a treasure trove of over 800 minutes of egocentric video recordings, synchronized across multiple perspectives and annotated with high-quality 3D poses for both hands and objects.

  • The dataset also features 3D object models with physically based rendering (PBR) materials, 2D bounding boxes, gaze signals, and 3D scene point clouds generated via simultaneous localization and mapping (SLAM); a minimal sketch of this annotation structure appears below.

  • Impressively, this dataset captures interactions from 19 subjects handling 33 common household objects, with scenarios ranging from simply picking up and examining objects to activities typical of kitchen, office, and living room environments.
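
To make that annotation structure concrete, here is a minimal Python sketch of how a single HOT3D-style frame might be represented; the record types and field names are hypothetical stand-ins, not Meta's actual HOT3D schema or tooling.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record types mirroring the annotations described above
# (synchronized multi-view frames, 3D hand and object poses, gaze);
# all names here are illustrative, not Meta's actual HOT3D schema.

@dataclass
class HandPose:
    side: str                      # "left" or "right"
    joints_xyz: List[List[float]]  # per-joint 3D positions, meters

@dataclass
class ObjectPose:
    object_id: str                 # one of the 33 household objects
    translation_xyz: List[float]   # 3D position, meters
    rotation_quat: List[float]     # orientation as a quaternion

@dataclass
class Frame:
    subject_id: int                # one of the 19 recorded subjects
    camera: str                    # which synchronized perspective
    hands: List[HandPose] = field(default_factory=list)
    objects: List[ObjectPose] = field(default_factory=list)
    gaze_dir: List[float] = field(default_factory=lambda: [0.0, 0.0, 1.0])

# One illustrative frame: a subject picking up a mug in a kitchen scene.
frame = Frame(
    subject_id=7,
    camera="egocentric_left",
    hands=[HandPose(side="right", joints_xyz=[[0.1, 0.2, 0.5]] * 21)],
    objects=[ObjectPose(object_id="mug",
                        translation_xyz=[0.15, 0.22, 0.48],
                        rotation_quat=[0.0, 0.0, 0.0, 1.0])],
)
print(frame.camera, len(frame.hands), len(frame.objects))
```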

  • And in another breakthrough, a new text-to-video competitor named Kling just launched its web version.

  • This allows users to generate videos up to 10 seconds long, with the enhanced model providing control over camera movements such as panning, tilting, and zooming, giving creators more flexibility and creativity.

  • Users can also freely create up to three 10-second videos per day using the enhanced model, making it accessible to a wide audience.

  • The only catch is that currently a Chinese cell phone number is still required to register with Kling, even for the web version.

  • Finally, Meta has announced their breakthrough AI system named 3DGen that's capable of generating high-quality 3D objects from simple text descriptions in less than one minute.

  • This innovation will likely revolutionize the creation of 3D content, making it faster and more accessible than ever before.

  • In fact, 3DGen leverages the power of not one, but two of Meta's existing models, AssetGen and TextureGen.

  • AssetGen is responsible for generating the initial 3D objects, while TextureGen handles the texturing process.

  • By integrating these models, Meta has created a system that produces superior 3D objects efficiently, enhancing the quality of immersive content creation.

  • The operation of 3DGen involves two key steps.

  • In the first step, AssetGen creates a 3D object complete with texture and physically based rendering (PBR) materials from a text description in approximately 30 seconds.

  • This rapid generation sets the stage for high-quality asset creation.

  • In the second step, TextureGen can either refine the texture of the object generated in the first step based on the original description, or create a texture for any unstructured 3D mesh using new specifications.

  • This process takes about 20 seconds per object.

  • Thanks to PBR support, the generated assets can be relit, adding another layer of realism and versatility; a pseudocode sketch of this two-step flow follows below.
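
In pseudocode terms, the two-step flow just described could be sketched as follows in Python; the function names and return values are placeholders for illustration and are not Meta's actual 3DGen interfaces.

```python
# Hypothetical interfaces illustrating the two-stage 3DGen flow described
# above; these function names are placeholders, not a published Meta API.

def asset_gen(prompt: str) -> dict:
    """Stage 1 (~30 s): text -> initial 3D object with texture and PBR."""
    return {"prompt": prompt, "mesh": "generated-mesh",
            "texture": "initial", "pbr": True}

def texture_gen(asset: dict, prompt: str) -> dict:
    """Stage 2 (~20 s): refine the texture, or texture any untextured mesh."""
    return dict(asset, texture=f"refined for: {prompt}")

def three_d_gen(prompt: str) -> dict:
    """End-to-end pipeline: under a minute per asset, per the figures above."""
    asset = asset_gen(prompt)           # initial object + texture (~30 s)
    return texture_gen(asset, prompt)   # texture refinement (~20 s)

print(three_d_gen("a weathered bronze dragon statue"))
```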

  • Both stages of 3DGen are built upon Meta's text-to-image models from the Emu family.

  • These models are optimized with synthetic 3D data to enable multi-view generation in both image and UV space, which significantly enhances the texture quality of the final product.

  • According to Meta, 3DGen not only outpaces existing methods, but also delivers superior results.

  • User studies have shown that professional 3D artists prefer 3DGen over leading industry solutions, particularly for complex tasks.

  • Meta claims that 3DGen is three to sixty times faster than comparable systems, a significant leap that could drastically reduce the time and effort required for 3D content creation.

  • Overall, Meta envisions 3DGen as a pivotal step toward a future where personalized, user-generated 3D content becomes commonplace.

  • The system's ability to quickly and accurately generate high-quality 3D objects from text descriptions opens up numerous possibilities.

  • AI-assisted 3D creation tools could enable users to build intricate virtual worlds in the metaverse, offering new opportunities for creativity and interaction.

  • AI itself could even use this same tech in the future to create video games in real time, but that's for another video.
