
Picture this scene: a robot faces a table holding a pile of objects, among them a red block and a blue block. If it is told to "stack the red block on the blue block," can it carry out the instruction smoothly?
(Source: Nano Banana Pro)
For a human, the task sounds trivial: fix your eyes on the red block, reach out and pick it up, then lock onto the blue block and set the red one on top. For most robots today, it is not that simple. Their success rate on such tasks is often unstable, and a robot may go straight for some other object instead of the target.
Why can't robots reliably complete such a simple task?
The reason is that when today's mainstream VLA models (Vision-Language-Action models) perform grasping tasks, their visual attention tends to be diffuse. In other words, the model can output an action sequence, but its internal attention is not truly focused on the target object named in the instruction (such as the red block). Instead, it is scattered across many regions of the image.
This misplaced attention leads directly to manipulation errors, such as grasping the wrong object or localizing inaccurately in multi-object environments.
To address this problem, a joint research team from the Hong Kong University of Science and Technology (Guangzhou), Westlake University, and other institutions systematically analyzed mainstream VLA models and then proposed ReconVLA, a reconstructive Vision-Language-Action model.
In this model, they introduce a new training paradigm called "Implicit Grounding." It adds no extra module at inference time and outputs no bounding boxes. Instead, during training the model is asked to reconstruct the image of the target manipulation region, which forces it to concentrate its attention on the right place as early as the visual encoding stage.
Figure | The team's paper (Source: arXiv)
The work reportedly received an Outstanding Paper Award last month at AAAI 2026, the conference of the Association for the Advancement of Artificial Intelligence.
Wenxuan Song, the paper's first author, told DeepTech that the mainstream VLA architecture typically uses a pretrained VLM as its backbone and attaches an action head that outputs the robot's control signals. In this architecture, the VLM is responsible for "seeing" and "understanding," while the action head is responsible for "doing."
Figure | Wenxuan Song (Source: interviewee)
The problem is that VLMs were originally designed for image understanding and dialogue tasks, so the visual representations they learn emphasize semantics, for example recognizing what is in an image and how the objects relate to one another. For robot manipulation, however, what matters is not only "what is in the image" but also "where should I act." That requires perceiving manipulation affordances, a capability that is not part of a VLM's native training objective.
Pengxiang Ding, another core member of the team, added that there is a significant domain gap between general-purpose vision models and embodied control tasks. Even a VLM that is extremely strong at image understanding will not necessarily transfer naturally to robot scenarios. This missing capability shows up directly as highly diffuse visual attention.
Figure | Pengxiang Ding (Source: interviewee)
In simple scenes, scattered attention may not matter much. If there is only one object on the table, the model will most likely grasp the right thing even if its attention wanders a little. But once the scene becomes cluttered, say with five or six items on the tabletop, trouble begins.
The team's experiments show that in such cases "the model tends to grab whatever object it sees." As soon as a graspable target appears in the wrist camera's field of view, the model leans toward executing a grasp. Whether it is grasping the object the human actually wanted is something it does not always care about. The grasp itself may succeed at a high rate, but grasping the correct object is another matter. This shows that the model completes the task at the action level without achieving intent alignment with the human.
A subtler problem appears in long-horizon tasks, meaning chains of manipulation that require several steps to be completed in sequence. Even if each step carries only a tiny error, the errors accumulate, and by the later steps the system state may already have drifted away from the training data distribution. Ding offered an intuitive figure: even with a per-step success rate as high as 99%, the overall success rate after 100 consecutive steps is only about 36.6%.
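Ding's figure can be checked with a few lines of arithmetic. The sketch below assumes that every step succeeds independently with the same probability, which is a simplification for illustration rather than a claim from the paper; the function name `chain_success_rate` is made up for this example.

```python
def chain_success_rate(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes each step succeeds independently with probability `per_step`
    (an illustrative simplification).
    """
    return per_step ** steps


if __name__ == "__main__":
    # 99% per step over 100 steps
    print(f"{chain_success_rate(0.99, 100):.1%}")  # 36.6%
```

Under the same assumption, shorter chains degrade far less: a 5-step chain at 99% per step still succeeds roughly 95% of the time.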
So what exactly does "implicit grounding" mean, and how is it implemented? Ding gave DeepTech an analogy: when humans perform fine manipulation, they see the whole scene, but they truly focus on only a small region. If the instruction is "pick up the cup," then even with ten items on the table, the human visual focus locks onto the cup automatically and everything around it blurs. In vision science this behavior is called "gaze."
ReconVLA borrows this mechanism. During training, in addition to the usual action prediction loss, the model must complete an auxiliary task: reconstructing the region of the current image that corresponds to the manipulation target, the so-called "gaze region."
(Source: the paper)
Specifically, the model's visual output tokens (called "reconstructive tokens") are fed into a lightweight diffusion transformer, whose goal is to recover the visual features of the gaze region from noise. If the model did not attend to the target region during encoding, the reconstructive tokens it outputs will not contain enough fine-grained information, the diffusion module will fail to complete the reconstruction, and the loss function will penalize the model.
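The following is a toy sketch of how such a joint objective could be wired together, not the authors' implementation. It assumes a standard noise-prediction diffusion loss, a mean-squared-error action loss, and a scalar weight `recon_weight`; all of these, along with every function name, are assumptions made for illustration. The `oracle_denoiser` is a stand-in that can recover the noise only when the conditioning tokens actually carry the gaze-region features.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def add_noise(clean, noise, alpha_bar):
    """Forward diffusion: blend clean features with Gaussian noise."""
    return [math.sqrt(alpha_bar) * c + math.sqrt(1.0 - alpha_bar) * n
            for c, n in zip(clean, noise)]


def oracle_denoiser(noisy, recon_tokens, alpha_bar):
    """Toy denoiser: treats the conditioning tokens as its guess of the
    clean gaze features and solves for the noise (hypothetical stand-in)."""
    return [(x - math.sqrt(alpha_bar) * t) / math.sqrt(1.0 - alpha_bar)
            for x, t in zip(noisy, recon_tokens)]


def training_loss(pred_action, true_action, gaze_features, recon_tokens,
                  denoiser, alpha_bar=0.5, recon_weight=1.0, rng=random):
    """Action loss plus a weighted noise-prediction reconstruction loss."""
    noise = [rng.gauss(0.0, 1.0) for _ in gaze_features]
    noisy = add_noise(gaze_features, noise, alpha_bar)
    pred_noise = denoiser(noisy, recon_tokens, alpha_bar)
    recon_loss = mse(pred_noise, noise)
    action_loss = mse(pred_action, true_action)
    return action_loss + recon_weight * recon_loss
```

With this toy denoiser, tokens that match the gaze features drive the reconstruction term to roughly zero, while uninformative tokens leave a large residual. That is the penalty the paragraph above describes.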
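This creates a clean feedback loop: to complete the reconstruction, the model must attend to the target; attending to the target makes the visual representation more precise; a more precise representation makes action prediction more accurate. Throughout the process there is no explicit bounding-box output and no external detection model involved in inference. The reconstruction module exists only during training and is removed entirely at inference. This means that ReconVLA's inference speed at deployment is identical to that of a conventional VLA model, with no added latency.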
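One way to picture the "training-only branch" idea is the sketch below. The class and method names are hypothetical and the components are plain callables; this is an illustration of the structure, not the released ReconVLA code.

```python
class PolicySketch:
    """Hypothetical wrapper: a shared backbone with two heads, only one of
    which survives into deployment."""

    def __init__(self, backbone, action_head, recon_head):
        self.backbone = backbone
        self.action_head = action_head
        self.recon_head = recon_head  # consulted only when computing the training loss

    def training_outputs(self, image, instruction):
        """Training path: both heads read the same hidden state."""
        hidden = self.backbone(image, instruction)
        return self.action_head(hidden), self.recon_head(hidden)

    def inference_policy(self):
        """Deployment path: backbone and action head only, so the
        reconstruction branch adds no work at inference time."""
        backbone, action_head = self.backbone, self.action_head

        def act(image, instruction):
            return action_head(backbone(image, instruction))

        return act
```

Because the reconstruction head shapes the backbone only through gradients during training, dropping it afterward leaves the action path unchanged.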
ÕâºÍ´ËǰµÄÊÓ¾õ¶¨Î»ÒªÁìÓÐʲô²î±ð£¿£¿£¿£¿£¿£¿£¿
Previously, visual grounding relied mainly on two paradigms. The first is "Explicit Grounding," used in work such as RoboGround and VIP, where an external detection model (such as YOLO or LISA) first crops out the target object and the cropped image is then fed into the VLA together with the original image. This approach does provide more focused visual information, but it depends on the accuracy of the external model, and simply concatenating two images introduces redundant information.
The second is "CoT Grounding" (chain-of-thought grounding), used in ECoT and GraspVLA, where the model first outputs the bounding-box coordinates of the target and then outputs the action. The idea is elegant in theory, but experiments show that it can perform even worse than the baseline. On the CALVIN benchmark, the CoT approach's success rate on 5-step consecutive tasks was close to zero. A possible reason is that grounding information in coordinate form is not an efficient guidance signal for a VLA model, and having to output both precise coordinates and precise action values places an extra burden on training.
Figure | Conceptual comparison of the different paradigms (Source: the paper)
Ïà±È֮ϣ¬£¬£¬£¬£¬£¬£¬£¬ReconVLA µÄÒþʽ¶¨Î»ÔÚͳһ»ù×¼ÉÏÈ¡µÃÁË×î¸ßЧ¹û¡£¡£¡£¡£¡£¡£
In the CALVIN ABC→D test, which requires the model to execute 5 consecutive tasks in the unseen environment D, ReconVLA reached a 64.1% success rate on the fifth subtask, compared with 49.0% for the baseline model and 50.2% for the explicit grounding method, an improvement of about 15 percentage points over the baseline. On the more challenging fine-manipulation task "stack block," the baseline succeeded only 59.3% of the time while ReconVLA reached 79.5%, an improvement of more than 20 percentage points.
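The gains quoted above are differences in percentage points, which can be recomputed directly from the reported rates. The dictionary layout and the helper name `point_gain` below are illustrative choices; the numbers are the ones given in the text.

```python
def point_gain(new_rate: float, old_rate: float) -> float:
    """Difference between two success rates, in percentage points."""
    return round(new_rate - old_rate, 1)


# Success rates (%) as reported in the article.
fifth_subtask = {"ReconVLA": 64.1, "baseline": 49.0, "explicit grounding": 50.2}
stack_block = {"ReconVLA": 79.5, "baseline": 59.3}

if __name__ == "__main__":
    print(point_gain(fifth_subtask["ReconVLA"], fifth_subtask["baseline"]))            # 15.1
    print(point_gain(fifth_subtask["ReconVLA"], fifth_subtask["explicit grounding"]))  # 13.9
    print(point_gain(stack_block["ReconVLA"], stack_block["baseline"]))                # 20.2
```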
Figure | Test score comparison across the different paradigms (Source: the paper)
To make the reconstruction capability generalize, the team also built a large-scale pretraining dataset containing more than 100,000 robot manipulation trajectories and 2 million data samples. The data sources include the open-source BridgeData V2 as well as the two simulation datasets LIBERO and CALVIN.
Gaze regions were annotated with the help of Grounding DINO, an open-vocabulary detector. Most of the data could be annotated directly in a zero-shot manner, and for some rarer or more complex objects in robot scenes the team fine-tuned the detector. Ablation experiments confirm that the pretraining stage significantly improves generalization: without pretraining, the final success rate on 5-step consecutive tasks drops from 64.1% to 58.2%.
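Once a detector has returned a box for the target object, turning it into a gaze-region crop is a small step. The sketch below assumes the box is given in pixel coordinates as `(x_min, y_min, x_max, y_max)` and that the image is a nested list of rows; both are assumptions for illustration, and no Grounding DINO API is called here.

```python
def crop_gaze_region(image, box):
    """Crop the region covered by `box` from `image`.

    `image` is a list of rows (each row a list of pixels) and `box` is
    (x_min, y_min, x_max, y_max) in pixels, with the max edges exclusive.
    The box is clamped to the image bounds before cropping.
    """
    height, width = len(image), len(image[0])
    x_min, y_min, x_max, y_max = box
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    if x_min >= x_max or y_min >= y_max:
        raise ValueError("box does not overlap the image")
    return [row[x_min:x_max] for row in image[y_min:y_max]]


if __name__ == "__main__":
    # 4x4 toy image whose pixel values are 0..15 in row-major order.
    toy = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(crop_gaze_region(toy, (1, 1, 3, 3)))  # [[5, 6], [9, 10]]
```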
Real-world experiments further validated the feasibility of the method.
The team used a 6-degree-of-freedom AgileX PiPer robotic arm with two depth cameras, one providing a base view and the other a hand view, and tested four representative tasks: putting fruit into a bowl, stacking bowls, flipping a cup, and tidying a tabletop. ReconVLA achieved the highest success rate on every task.
The "unseen objects" test is especially noteworthy. When the target object was absent from the training data, the success rates of the comparison methods OpenVLA and PD-VLA were close to zero, whereas ReconVLA could still localize the target and complete the manipulation, demonstrating its visual generalization ability.
Figure | Real-world setups for the four representative tasks (Source: the paper)
Of course, no method is perfect. Song told DeepTech candidly that ReconVLA's main extra cost lies in the training stage: introducing a reconstruction objective means more computational overhead, although the team has kept the diffusion module lightweight to limit it. Ding pointed out another limitation: the current modeling is still based mainly on 2D visual space. In high-precision tasks that require depth information and 3D geometric constraints, spatial manipulation accuracy may still be limited even when 2D grounding becomes more accurate.
The team said they have begun exploring 3D-aware modeling in follow-up work and have submitted the results to a recent academic conference. In addition, multimodal information such as force sensing and force-control signals has not yet been incorporated into the framework, but structurally these modalities could be integrated through the same implicit modeling mechanism.
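On the prospects for deploying embodied intelligence, Ding takes a pragmatic view. He believes a VLA does not need to land in one specific vertical scenario right away to be valuable. He draws an analogy with the early days of ChatGPT: when GPT-3 was released, it was not immediately embedded in any particular industry workflow, yet it markedly changed the efficiency of writing and content creation.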
The value of VLA may follow a similar "two-step" path. The first step is lowering deployment cost. In the past, every factory task required its own model; with a sufficiently strong foundation model, a company would need only a small amount of fine-tuning to adapt it. The second step is combining it with agent systems to build closed-loop workflows for specific scenarios.
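He added that the team has deployed generally trained models in real industrial environments to test tasks such as tightening screws and plugging in components, and the results showed that as long as the base model is stable enough, downstream task performance improves markedly. In his view, the scenarios with the most potential in the short to medium term include semi-structured industrial assembly, fine manipulation in light industry, and commercial service robots (such as beverage making). What these scenarios share is a well-defined manipulation chain, high precision requirements, and a hard need for repeatable stability.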
Beyond research, the team has also co-founded an open-source community called OpenHelix, which has so far open-sourced more than ten projects and accumulated about 3,600 GitHub stars. With limited resources, they chose a path of "efficiency and focus." Rather than pursuing large-scale training on hundreds of GPUs or heavily engineered demos, they concentrate on research directions that offer methodological insight.
They believe that only through open sharing can research results truly reach more practitioners. Besides further iterations of ReconVLA, the team is also advancing research on tactile sensing and force feedback, dual-arm collaboration, and other directions, aiming to widen the capability boundary of VLA rather than stopping at simple demonstration applications.
https://arxiv.org/html/2508.10333v1
Operations/Layout: He Chenlong