
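Reported by 新智元 (Xinzhiyuan)
Editor: LRST
[新智元 Introduction] Multi-agent AI systems need an explicit consensus mechanism to coordinate the decisions of different AI agents. A new theoretical framework models multi-agent reasoning as a distributed consensus process, substantially improving system performance and cutting latency and compute cost, which moves multi-agent AI from the experimental stage toward real-world use.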
Over the past year, LLM agents have become the shared direction of nearly every AI research team and of industry.
OpenAI keeps pushing stronger reasoning and tool-use capabilities, Google DeepMind explicitly models reasoning as a search problem, and Anthropic improves model reliability through explicit norms and self-critique.
A clear industry trend is taking shape: single-model capability is approaching a structural boundary, and multi-agent systems are seen as the next step.
The latest research paper from Jialin Li, co-founder of Advaita Research/Hetu, proposes an explicit theoretical framework for consensus in multi-agent collaboration and reports a step-change improvement in a set of production-grade system metrics: with accuracy essentially unchanged, up to a 20× reduction in end-to-end latency, up to an 11× improvement in P99 tail latency, and up to a 4.4× reduction in token cost.
Paper: https://arxiv.org/pdf/2512.20184
English version: https://x.com/advaita_labs/status/2018576622048473241
This work pulls the problem of multi-agent reasoning away from prompt and workflow design and back to system design and engineering: consistency semantics, stopping conditions, and tail-latency management.
In engineering terms, the paper's core judgment can be summed up in one sentence: today's multi-agent systems lack explicit system semantics for Agentic Consensus.
Stephanie Yu, CMO of Advaita Research / Hetu, interprets the paper from a systems-engineering perspective.
Research background
Among today's mainstream approaches, the agent work of large research organizations falls roughly into three categories, yet all three stay silent on one key question: when multiple stochastic reasoning agents work in parallel, at what point can the system be considered to have reached stable agreement?
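OpenAI: strengthening single-agent reasoning
OpenAI's path has consistently centered on test-time scaling, including self-consistency, multi-path reasoning, stronger chain-of-thought, and more mature tool use.
Under single-agent conditions this approach has clear engineering advantages: reasoning quality is highly controllable, behavior is strongly consistent, and engineering complexity stays concentrated in one place.
Its implicit premise is equally clear: the system has only one decision-making agent.
Once the system is extended to multiple planners and multiple actors running in parallel, consistency is no longer guaranteed inside the model; it is outsourced to a combination of rules in the workflow layer above.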
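Google DeepMind: search-based reasoning
Methods such as Tree-of-Thoughts explicitly model reasoning as a search problem and use an evaluation function to select the best solution among candidate paths.
This paradigm performs reliably on offline reasoning and math problems, but at the system level it shows two notable traits: the reasoning process is highly synchronous, and the stopping condition is set by search depth or a budget cap.
In essence, these methods optimize path quality, not the question of when to decide under concurrency, latency, and cost constraints.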
Anthropic/Meta: heuristic coordination
Anthropic's constitutional debate, along with the multi-agent debate / society-of-minds approaches proposed by Meta and Stanford, introduced interaction among multiple agents.
In engineering practice, such systems usually rely on a fixed number of agents, a fixed number of rounds, barrier synchronization (waiting for all agents to finish), and majority voting or rule-based aggregation.
But these mechanisms do not provide a system-level definition of stable consistency.
While mainstream agent approaches are still strengthening "how to reason better" and treat multi-agent setups as a stacking of reasoning techniques, this research from Advaita Research pushes the question down to the system layer: when multiple stochastic reasoning agents run in parallel, how is agreement defined, verified, and stably reached?
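Treating multi-agent reasoning as a distributed system
The core system proposed in the paper is called Aegean. Its fundamental reframing is that multi-agent reasoning is no longer treated as a workflow orchestration problem but is modeled as a distributed consensus process.
Unlike in traditional distributed systems, agent decisions are stochastic and uncertain, so existing consensus protocol designs do not apply directly. The paper proposes a new theoretical framework for consensus in multi-agent settings and gives a rigorous definition of correctness for multi-agent consensus.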
Building on this framework, the paper then proposes a new consensus protocol. Its core mechanisms come down to three points:
(1) Quorum-fast instead of wait-all
The system no longer waits for every agent. It advances the decision as soon as a quorum is reached, so latency is no longer determined by the slowest agent.
(2) A stability window (β) instead of "stop on agreement"
Agreement must persist over time before it counts as valid consensus, which filters out transient majorities.
(3) Streaming consensus and immediate cancellation
The consensus state is checked continuously during token generation; once the stability condition is met, the remaining generation is terminated immediately.
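The quorum-fast and cancellation ideas can be sketched with asyncio. This is a minimal illustration under stated assumptions, not the Aegean protocol itself: the `agent` coroutine, its simulated delays, and the `quorum_fast` helper are hypothetical stand-ins for real LLM calls, and the stability window β is left out for brevity.

```python
import asyncio
from collections import Counter


async def agent(name, answer, delay):
    # Hypothetical stand-in for an LLM agent: wait `delay` seconds, then answer.
    await asyncio.sleep(delay)
    return name, answer


async def quorum_fast(agent_specs, quorum):
    """Decide as soon as `quorum` agents agree, then cancel the stragglers."""
    tasks = [asyncio.create_task(agent(*spec)) for spec in agent_specs]
    votes = Counter()
    decided = None
    try:
        # as_completed yields results in completion order, not submission order.
        for finished in asyncio.as_completed(tasks):
            _, answer = await finished
            votes[answer] += 1
            if votes[answer] >= quorum:
                decided = answer
                break
    finally:
        # Immediate cancellation: stop agents whose output is no longer needed.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
    return decided, sum(votes.values())


def run_demo():
    # Three fast agents and one very slow one; a quorum of 3 out of 4 is assumed.
    specs = [("a1", "42", 0.01), ("a2", "42", 0.02),
             ("a3", "42", 0.03), ("a4", "41", 5.0)]
    return asyncio.run(quorum_fast(specs, quorum=3))
```

In this toy run the decision is made after three responses and the five-second agent is cancelled, so the caller never pays for the slowest participant.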
Results and experimental analysis
The paper argues that multi-agent reasoning is, in essence, a distributed consensus problem running on top of stochastic reasoning agents.
Without explicit consensus semantics, engineering failures are not sporadic; they follow highly predictable, systematic patterns.
Transient agreement: the majority is not stable
The paper systematically measures the decision flip phenomenon, which is almost never modeled explicitly in existing agent workflows.
The results show that after reasoning exchange between agents is introduced, accuracy improves, but the majority decision also reverses between adjacent rounds significantly more often.
Taking MMLU as an example, 64 decision flips occurred across 100 samples, meaning the system repeatedly changed its majority conclusion over consecutive rounds.
Without a stability constraint, any early-stopping or voting mechanism based on the "current majority" may fire on a transient agreement.
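One way to make the difference between transient and stable agreement concrete is to track the majority answer round by round. The sketch below is an illustrative reading of the stability-window idea, assuming β counts consecutive rounds in which the same answer holds a quorum; the paper's formal definition may differ, and the function names and sample votes are hypothetical.

```python
from collections import Counter


def majority(votes):
    """Most common answer in one round of votes."""
    return Counter(votes).most_common(1)[0][0]


def count_flips(rounds):
    """Number of adjacent round pairs whose majority answer differs."""
    majorities = [majority(votes) for votes in rounds]
    return sum(a != b for a, b in zip(majorities, majorities[1:]))


def stable_round(rounds, quorum, beta):
    """Index of the first round where one answer has held a quorum for
    `beta` consecutive rounds, or None if that never happens."""
    current, streak = None, 0
    for i, votes in enumerate(rounds):
        answer, support = Counter(votes).most_common(1)[0]
        if support >= quorum and answer == current:
            streak += 1
        elif support >= quorum:
            current, streak = answer, 1   # a new answer takes the quorum
        else:
            current, streak = None, 0     # no quorum this round
        if streak >= beta:
            return i
    return None


# Made-up votes from three agents over four rounds.
ROUNDS = [["A", "A", "B"], ["B", "B", "A"], ["B", "B", "B"], ["B", "B", "A"]]
print(count_flips(ROUNDS))                       # 1
print(stable_round(ROUNDS, quorum=2, beta=1))    # 0
print(stable_round(ROUNDS, quorum=2, beta=2))    # 2
```

With `beta=1` the check stops at round 0 on answer "A", which the majority abandons one round later; with `beta=2` it waits until "B" has persisted and stops at round 2.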
This is not a problem of reasoning capability; it is a problem of consensus never having been defined.
Wrong synchronization model: P99 is defined by the slowest agent
Today's multi-agent systems commonly use barrier synchronization. In an AIME scenario at 1 req/s, the paper compares the mainstream approach with the system after the consensus mechanism is introduced:
Multi-agent baseline (MaxRound = 6): average latency of 6571 s, P99 latency of 8749 s
With the consensus mechanism: average latency of about 325 s, P99 latency of 772 s
Under the same task conditions, P99 latency improves by about 11× and average latency by about 20×.
The difference does not come from model reasoning capability; it comes from the synchronization paradigm shifting from "wait for everyone" to "proceed once consensus is reached".
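Why barrier synchronization ties latency to the slowest agent can be seen from order statistics: waiting for all agents costs the maximum of their finish times, while a quorum of k costs only the k-th smallest. The sketch below illustrates this with made-up per-agent times (an assumption, not data from the paper) and also recomputes the improvement factors from the latency figures quoted in this section.

```python
def barrier_latency(finish_times):
    """Wait-all: the round ends when the slowest agent finishes."""
    return max(finish_times)


def quorum_latency(finish_times, quorum):
    """Quorum-fast: the round ends when the `quorum`-th agent finishes.
    Simplification: assumes the first `quorum` responders agree."""
    return sorted(finish_times)[quorum - 1]


def improvement(baseline, improved):
    """Speed-up factor, rounded to one decimal place."""
    return round(baseline / improved, 1)


# Hypothetical finish times in seconds for five agents, one of them a straggler.
finish_times = [12.0, 15.0, 14.0, 13.0, 300.0]
print(barrier_latency(finish_times))      # 300.0
print(quorum_latency(finish_times, 3))    # 14.0

# Figures quoted in the article for AIME at 1 req/s.
print(improvement(8749, 772))   # 11.3  (P99 latency)
print(improvement(6571, 325))   # 20.2  (the pair behind the ~20x figure)
```

A single straggler dominates the wait-all latency but has no effect on the quorum latency, which is the effect the reported 11× and 20× factors reflect.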
Wasted compute: tokens are spent after convergence
The paper further quantifies a long-overlooked problem in multi-agent systems: useless computation after convergence.
Across several benchmark tasks, token usage after introducing Agentic Consensus:
GSM8K: 4.4× reduction (about 1.3K vs 5.7K)
MMLU: 3.3× reduction (about 3.3K vs 10.7K)
AIME: 1.3× reduction (about 46.0K vs 59.9K)
IMO: 1.1× reduction (about 64.8K vs 73.8K)
Meanwhile, the change in accuracy stays within about 2.5%.
This indicates that the drop in token cost comes from consensus-driven early stopping and cancellation, not from sacrificing quality.
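As a quick sanity check, the reduction factors can be recomputed from the rounded token counts listed above. The dictionary below simply restates those figures in thousands of tokens (consensus first, baseline second); because the inputs are rounded, the recomputed MMLU factor comes out at 3.2 rather than the reported 3.3.

```python
# (tokens with Agentic Consensus, baseline tokens), in thousands, as quoted above.
TOKENS_K = {
    "GSM8K": (1.3, 5.7),
    "MMLU": (3.3, 10.7),
    "AIME": (46.0, 59.9),
    "IMO": (64.8, 73.8),
}


def reduction_factors(tokens):
    """Baseline tokens divided by consensus tokens, rounded to one decimal."""
    return {
        name: round(baseline / consensus, 1)
        for name, (consensus, baseline) in tokens.items()
    }


print(reduction_factors(TOKENS_K))
# {'GSM8K': 4.4, 'MMLU': 3.2, 'AIME': 1.3, 'IMO': 1.1}
```

The savings are largest on the benchmarks with the shortest outputs (GSM8K and MMLU) and small on AIME and IMO.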
The numbers trace the system boundary
After introducing Agentic Consensus (the multi-agent consensus modeling approach proposed by Advaita Research), system behavior changes markedly, in some cases by an order of magnitude: average latency drops by 1.2× to 20×, P99 tail latency improves by up to 11×, token consumption drops by 1.1× to 4.4×, and accuracy varies by about 2.5%.
These metrics all point to the same system-level conclusion: the performance bottleneck of multi-agent reasoning comes not from model capability but from whether the coordination mechanism has operable consensus semantics.
Engineering judgment and application outlook
Agentic Consensus is not an add-on capability; it is a clear dividing line between systems.
When agents run as units of action in real systems, the question is no longer "can a single model reason better?" but whether, with multiple stochastic reasoning agents running in parallel, the system has consistency semantics that are decidable, stoppable, and scalable.
The paper's core criterion is this: if a multi-agent system cannot clearly answer "when does agreement count as reached, when is it safe to stop, and what determines latency", then in engineering terms it is still a workflow, not a system.
Seen this way, decision flips, a P99 defined by the slowest agent, and wasted tokens after convergence are not flaws in implementation detail; they are signals that the system has not yet reached the stage where consensus is operable.
This work from Advaita Research does not propose a new way to use agents. It elevates Agentic Consensus into an engineering criterion: has multi-agent reasoning moved from "a stack of reasoning techniques" to "a system with verifiable consensus semantics"?
When this criterion holds, multi-agent systems can truly move from demo to production; when it does not, even the most elaborate reasoning pipeline merely piles computation on top of synchronization cost.
References:
https://arxiv.org/pdf/2512.20184

