SwingArena: From "Committing Correct Code" to "Passing CI Review"
2026-03-01 00:00:20

Over the past year, the coding ability of large models has improved at a pace that is almost visible to the naked eye. From simple scripts to complete functional modules, models such as GPT, Claude, and DeepSeek can now generate code that looks quite "professional" within a few seconds.

This progress has led many people to seriously consider one question: can AI truly take part in the core workflow of software engineering?

But the closer one gets to real development, the more complicated this question becomes. In industry, "writing a piece of code that runs" is far from enough.

Whether code can be merged depends on whether it passes the full Continuous Integration (CI) pipeline: a mechanism that uses automated builds, tests, and code checks during development to ensure that every change runs stably in a real engineering environment.

In addition, the code must follow project conventions, withstand code review, and remain stable and reliable across multiple rounds of revision. Unfortunately, nearly all mainstream code benchmarks today stop at the level of "can it pass a few unit tests".

SwingArena sets out to fill this long-standing gap in evaluation.

The paper has been formally accepted to ICLR 2026. SwingArena is now fully open-sourced.

Paper title: SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Paper link: https://arxiv.org/abs/2505.23932
Project link: https://swing-bench.github.io/

From "writing correct code" to "passing review": evaluation logic needs a change of direction

ÔڹŰåÆÀ²âÖУ¬£¬£¬ £¬£¬ £¬Ä£×ÓÃæÁÙµÄÊÇÒ»¸ö¸ß¶È¼ò»¯µÄÎÊÌ⣺¸ø¶¨º¯ÊýÊðÃûºÍ˵Ã÷£¬£¬£¬ £¬£¬ £¬Ö»ÒªÊä³öÄÜͨ¹ý²âÊÔµÄʵÏÖ¼´¿É¡£¡£¡£¡£¡£ÕâÖÖÉ趨¹ØÓÚȨºâ»ù´¡±à³ÌÄÜÁ¦ÊÇÓÐÓõ쬣¬£¬ £¬£¬ £¬µ«ËüºöÂÔÁËÕæÊµÈí¼þ¿ª·¢ÖÐ×îÒªº¦µÄÒ»»· ¡ª¡ªÉó²éÓëµü´ú¡£¡£¡£¡£¡£

ÔÚÏÖʵÖУ¬£¬£¬ £¬£¬ £¬Ò»¶Î´úÂëÍùÍùÒªÂÄÀú¶à¸ö»ØºÏµÄ·´ÏìÓëÐ޸쬣¬£¬ £¬£¬ £¬²Å»ª×îÖÕ±»½ÓÊÜ¡£¡£¡£¡£¡£CI ϵͳ»á×Ô¶¯¼ì²é±àÒë¡¢²âÊÔ¡¢´úÂëÆø¸ÅºÍDZÔÚΣº¦£¬£¬£¬ £¬£¬ £¬¶øÉó²éÕßÔò»á´ÓÂß¼­×¼È·ÐÔ¡¢½çÏßÇéÐκͿÉά»¤ÐԵȽǶÈÒ»Ö±Ìá³öÖÊÒÉ¡£¡£¡£¡£¡£ÕâÖÖÀú³Ì£¬£¬£¬ £¬£¬ £¬ÊµÖÊÉÏÊÇÒ»ÖÖÒ»Á¬²©ÞÄ¡£¡£¡£¡£¡£

SwingArena brings this contest into evaluation. Instead of letting a model "fight alone", it uses an adversarial setup in which two models play the roles of "submitter" and "reviewer" and face off repeatedly in a real CI environment.

The submitter has to write a patch robust enough to pass the pipeline, while the reviewer tries to expose latent problems with carefully designed tests. The final score is determined entirely by real execution results.
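The submitter/reviewer round described above can be sketched in a few lines. This is only a toy illustration, not SwingArena's actual harness: `run_ci`, `score_round`, the one-point scoring, and the rule that a reviewer test counts only if a reference fix passes it are all assumptions made for the example. Patches are modeled as plain Python functions, and "CI" is reduced to running test cases.

```python
from typing import Callable, List, Tuple

# A test case is (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


def run_ci(patch: Callable, tests: List[TestCase]) -> bool:
    """Toy stand-in for a CI run: pass only if every test passes without raising."""
    for args, expected in tests:
        try:
            if patch(*args) != expected:
                return False
        except Exception:
            return False
    return True


def score_round(patch, reference, project_tests, reviewer_tests):
    """Score one submitter-vs-reviewer round from execution results only.

    Assumption (hypothetical rule): a reviewer test is valid only if the
    reference fix passes it, so the reviewer cannot win with a wrong test.
    """
    valid_tests = [t for t in reviewer_tests if run_ci(reference, [t])]
    passes_project = run_ci(patch, project_tests)
    passes_reviewer = run_ci(patch, valid_tests)
    return {
        "submitter": int(passes_project and passes_reviewer),
        "reviewer": int(bool(valid_tests) and not passes_reviewer),
    }


def good_clamp(x, lo, hi):
    return min(max(x, lo), hi)


def buggy_clamp(x, lo, hi):
    return max(x, lo)  # forgets the upper bound


PROJECT_TESTS = [((5, 0, 10), 5), ((-1, 0, 10), 0)]
# The second reviewer test is wrong on purpose and gets filtered out.
REVIEWER_TESTS = [((11, 0, 10), 10), ((3, 0, 10), 4)]

print(score_round(buggy_clamp, good_clamp, PROJECT_TESTS, REVIEWER_TESTS))
print(score_round(good_clamp, good_clamp, PROJECT_TESTS, REVIEWER_TESTS))
```

In this toy round the buggy patch passes the project's own tests but fails the reviewer's valid test, so the reviewer scores; the correct patch survives both, so the submitter scores.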

ÕæÊµ¹¤³ÌÇéÐΣ¬£¬£¬ £¬£¬ £¬Òâζ×ÅÕæÊµÖØÆ¯ºó

An adversarial mechanism alone is not enough to bring evaluation close to industrial settings. A more practical challenge is that the code in real projects far exceeds the context window of large models.

A typical open-source repository often contains tens of thousands of lines of code spread across hundreds of files. A model cannot "read the whole repository"; it has to make judgments within a very limited context. SwingArena therefore includes a complete retrieval-augmented pipeline, RACG (Retrieval-Augmented Code Generation), which tries to balance "how much code to give the model" against "giving it the right code".

The core idea of RACG is to first narrow down the set of candidate files quickly with classical information retrieval, then split the code into chunks along syntactic structure, and finally rerank the chunks with a semantic model. Under a strict token budget, the system dynamically adjusts the granularity of the context so that the model sees the most critical and relevant code fragments rather than noise.
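The three stages above (coarse file filtering, syntax-aware chunking, reranking under a token budget) can be sketched as follows. This is a simplified illustration under stated assumptions, not the RACG implementation: plain term overlap stands in for both the classical retriever and the semantic reranker, whitespace-separated words stand in for tokens, chunking handles only Python via the standard `ast` module, and greedy packing replaces the dynamic granularity adjustment. All function names and the tiny repository are hypothetical.

```python
import ast
import re
from typing import Dict, List, Tuple


def terms(text: str) -> set:
    """Lowercased alphabetic terms; 'parse_date' yields {'parse', 'date'}."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(query: str, text: str) -> float:
    """Fraction of query terms found in the text (placeholder relevance score)."""
    q = terms(query)
    return len(q & terms(text)) / len(q) if q else 0.0


def top_files(query: str, repo: Dict[str, str], k: int) -> List[str]:
    """Stage 1: coarse filtering of files by lexical overlap."""
    return sorted(repo, key=lambda path: overlap(query, repo[path]), reverse=True)[:k]


def syntax_chunks(source: str) -> List[str]:
    """Stage 2: one chunk per top-level function or class."""
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            segment = ast.get_source_segment(source, node)
            if segment:
                chunks.append(segment)
    return chunks


def build_context(query: str, repo: Dict[str, str],
                  k_files: int = 2, token_budget: int = 8) -> List[Tuple[str, str]]:
    """Stage 3: rerank chunks, then pack greedily until the budget is used up."""
    candidates = []
    for path in top_files(query, repo, k_files):
        for chunk in syntax_chunks(repo[path]):
            candidates.append((overlap(query, chunk), path, chunk))
    candidates.sort(key=lambda c: c[0], reverse=True)

    picked, used = [], 0
    for _score, path, chunk in candidates:
        cost = len(chunk.split())  # crude token count (assumption)
        if used + cost <= token_budget:
            picked.append((path, chunk))
            used += cost
    return picked


REPO = {
    "dates.py": "def parse_date(text):\n    return text.strip()\n\n"
                "def format_date(value):\n    return str(value)\n",
    "http.py": "def fetch(url):\n    return url\n",
    "tz.py": "def timezone_offset(name):\n    return 0\n",
}

for path, chunk in build_context("parse_date ignores timezone offset", REPO):
    print(path, "->", chunk.splitlines()[0])
```

With the small budget used here, the less relevant `format_date` chunk is left out even though its file passed the first stage, which is the point of chunk-level reranking.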

Ablation experiments show that this layered retrieval strategy significantly improves the accuracy of patch localization: compared with keyword matching alone, the Top-10 hit rate more than doubles. This means the model is not merely "writing code" but working within a scope of awareness closer to that of a human engineer.
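For readers unfamiliar with the metric, a Top-k hit rate can be computed as below. The article does not spell out the paper's exact definition, so this uses one common reading as an assumption: an issue counts as a hit if at least one file that the real fix touched appears among the top k retrieved files. The issue IDs and file names are made up for illustration.

```python
from typing import Dict, List, Set


def hit_rate_at_k(rankings: Dict[str, List[str]],
                  gold: Dict[str, Set[str]], k: int = 10) -> float:
    """Share of issues whose top-k retrieved files include a gold file."""
    hits = sum(1 for issue, ranked in rankings.items()
               if set(ranked[:k]) & gold[issue])
    return hits / len(rankings)


# Hypothetical retrieval output (best match first) and ground truth.
rankings = {
    "issue-1": ["a.py", "b.py", "c.py"],
    "issue-2": ["x.py", "y.py"],
}
gold = {"issue-1": {"c.py"}, "issue-2": {"z.py"}}

print(hit_rate_at_k(rankings, gold, k=2))
print(hit_rate_at_k(rankings, gold, k=3))
```

Widening k from 2 to 3 brings `c.py` into range for the first issue, so the hit rate rises from 0.0 to 0.5 on this toy data.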

When models truly compete, the differences start to show

In SwingArena's evaluation, an interesting phenomenon gradually emerges: the "personality differences" between models in engineering decisions are amplified as never before.

Take GPT-4o as an example. In the submitter role it behaves very aggressively and can often quickly generate patches that beat the opponent's tests, so its win rate is high. The cost of this strategy is an unstable CI pass rate, with code more prone to problems in convention compliance and robustness.

By contrast, DeepSeek and Gemini appear more conservative. The code they generate is more conventional in style and more likely to pass CI, and they show stronger stability in multi-language scenarios in particular. Differences of this kind are often hidden by the "average score" in traditional benchmarks, but become very visible in adversarial evaluation.
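Why a single average can hide this trade-off becomes clearer once win rate and CI pass rate are tallied separately. The sketch below does so over a log of rounds. The `Round` record, its fields, and every value in `LOG` are invented for illustration and are not results from the paper; here `won` is assumed to mean the patch survived the opponent's tests, and `ci_passed` that the full pipeline (including style and lint checks) succeeded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Round:
    submitter: str   # placeholder model name
    won: bool        # assumption: patch survived the opponent's tests
    ci_passed: bool  # assumption: the full CI pipeline succeeded


def summarize(rounds: List[Round]) -> Dict[str, Dict[str, float]]:
    """Per-model win rate and CI pass rate over all rounds as submitter."""
    counts: Dict[str, Dict[str, int]] = {}
    for r in rounds:
        c = counts.setdefault(r.submitter, {"rounds": 0, "wins": 0, "ci": 0})
        c["rounds"] += 1
        c["wins"] += int(r.won)
        c["ci"] += int(r.ci_passed)
    return {
        model: {
            "win_rate": c["wins"] / c["rounds"],
            "ci_pass_rate": c["ci"] / c["rounds"],
        }
        for model, c in counts.items()
    }


# Invented toy log: model_a plays aggressively, model_b conservatively.
LOG = [
    Round("model_a", True, False), Round("model_a", True, True),
    Round("model_a", True, False), Round("model_a", False, False),
    Round("model_b", True, True), Round("model_b", False, True),
    Round("model_b", True, True), Round("model_b", False, True),
]

for model, stats in summarize(LOG).items():
    print(model, stats)
```

On this toy log, `model_a` wins more often but passes CI far less often than `model_b`, the kind of split a single blended score would blur.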

More importantly, these results offer clear guidance for practical use: when the goal is rapid prototyping and exploratory development, an aggressive strategy may be more useful; in production environments and long-lived projects, stability clearly matters more.

From evaluation to practice: why SwingArena deserves attention

SwingArena µÄÒâÒ壬£¬£¬ £¬£¬ £¬²¢²»µ«½öÔÚÓÚÌá³öÁËÒ»¸öÐ嵀 benchmark¡£¡£¡£¡£¡£Ëü¸üÖ÷ÒªµÄ¼ÛÖµ£¬£¬£¬ £¬£¬ £¬ÔÚÓÚÍÆ¶¯ÁËÒ»´ÎÆÀ²âÊӽǵÄת±ä£º´Ó ¡°¹¦Ð§×¼È·ÐÔ¡± ×ßÏò ¡°¹¤³Ì¿ÉÓÃÐÔ¡±¡£¡£¡£¡£¡£

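By bringing CI pipelines, code review, and multi-round iteration into the evaluation process, SwingArena makes it possible for the first time to answer questions like these systematically: Which models are really ready for production environments? How should they be chosen and used in different engineering scenarios? And how should AI programming assistants be designed to better match real needs?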

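Once the paper's anonymity period ends, SwingArena will be fully open-sourced, including the dataset, the evaluation framework, the retrieval pipeline, and all code for reproducing the experiments. The team hopes the framework will serve not only as a new tool for researchers to compare models, but also as a reference for industry in evaluating and deploying AI programming capabilities.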

When AI-generated code truly enters the CI pipeline, evaluation standards must be upgraded accordingly.

SwingArena is a step in that direction.