Author: Zhang Chao, R&D Center, Renmin Zhongke
Abstract: For a long time, the computer vision field achieved great success by relying on large-scale labeled datasets. The application of convolutional neural networks in particular brought leapfrog progress to every vision sub-field; academia and industry poured in research and applications, and for a while everyone believed the edifice of artificial intelligence was about to be completed. Recently, however, academic research on self-supervised learning (SSL), Transformers, MLPs, and the like has become the hot topic. The advance of Transformers and MLPs in particular looks set to leave supervised learning and convolutional architectures washed up on the beach, and the author believes the computer vision (CV) field is entering a new era of transformation.
This article focuses on self-supervised learning in CV: basic concepts, its relationships and applications across the vision sub-fields, current progress, and some reflections. Plenty of articles already explain the principles and techniques of specific self-supervised methods, so this one does not; instead it tries to look at the characteristics and current limitations of self-supervised learning from other angles and to distill some experience, in the hope of sparking more creative ideas. The author's own view is limited and some opinions are bound to be biased; criticism and corrections are welcome.
I. An Introduction to Self-Supervised Learning
At AAAI 2020, Yann LeCun gave a talk on self-supervised learning and declared it the future of artificial intelligence. Since late 2019, methods such as the MoCo series, SimCLR, and BYOL have erupted one after another, matching the performance of labeled-dataset training while using unlabeled data, with gains on nearly every downstream task; this has made self-supervision a research hotspot across CV. Its advantage is that training can be done on unlabeled data, whereas supervised learning needs large amounts of labeled data and reinforcement learning needs extensive trial-and-error interaction with an environment. In an era where data is king, this property has convinced many people that self-supervised learning is the real direction for AI.
Self-supervised learning is a newer term alongside the familiar supervised and unsupervised learning; such methods were originally filed under unsupervised learning. Papers with Code [1] defines it as learning a representation from unlabeled data in a self-supervised way, concretely by optimizing the objective of a pretext task to obtain feature representations. The pretext task can be a prediction task, a generative task, or a contrastive learning task, and its supervisory signal comes from the data itself. For example, the pretext task could be image colorization, predicting where a cropped patch came from, or predicting the order of video frames. Reasoning backwards from the outcome: the data itself carries no labels, so we have to design a task ourselves that determines the labels. In the figure below [2], nine patches are cut out of an image and the model must predict each patch's position; automatically constructing a label for each patch is the label-generation step, and predicting the positions is the pretext task.
ͼ1 ͼÏñ¿éÏà¶ÔλÖÃÔ¤²â
The recently popular and strikingly effective self-supervised models, the MoCo series, SimCLR, and so on, are almost all built by contrasting positive and negative sample pairs; the exceptions, BYOL and SimSiam, drop the negative samples but still set up a contrast between two networks. All of them belong to contrastive learning, and one could say the current self-supervised boom is really a boom in contrastive self-supervision. The basic principle is a Siamese-style architecture: positive and negative pairs are fed in, a loss is computed between the outputs of the two branches, and the network learns features that pull similar samples together and push dissimilar samples apart. The automatic label construction is done with the usual data augmentations, as in the figure below [3]: random cropping, color jitter, blurring, and so on turn an original image into similar (positive) pairs, while different originals or their augmented versions form dissimilar (negative) pairs. When transferred to downstream datasets for classification, detection, or segmentation, the trained contrastive networks perform on par with supervised models.
ͼ2 SimCLRʹÓõÄÊý¾ÝÔöÇ¿·½·¨
The development of contrastive self-supervised methods is sketched in the figure below, covering several of the most closely watched methods up to March 2021. The Facebook and Google research teams have been trading blows, and the contrastive framework has gradually shed tricks and structural add-ons, moving toward the Chinese philosophical ideal that the greatest truths are the simplest.
ͼ3 ×Լල¶Ô±Èѧϰ·¢Õ¹Àú³Ì
Looked at from another angle, if we set aside downstream fine-tuning and care only about learning the pretext task, then self-supervised learning becomes a big dye vat: any pretext task we can construct can be dropped into the framework, and the learned features and network take on that task's discriminative power. Like casting a spell, we can custom-build a neural network's abilities. Quite a few published results already use self-supervision for frame-order prediction, video playback-speed estimation, image rotation prediction, and more.
II. How Self-Supervised Learning Relates to Other Fields
Given contrastive learning's strong momentum and its overwhelming share of the self-supervised field, the rest of this article simply says "contrastive learning" where it means self-supervised learning. Digging into the contrastive framework reveals similarities and connections with other CV methods such as knowledge distillation and representation learning, discussed one by one below.
1. Contrastive learning and knowledge distillation
The two have very similar network structures: both are two-branch architectures, and both compute a loss on the outputs of the two branches. The difference is that distillation usually fixes a teacher network, with a student smaller than the teacher, whereas in contrastive learning the two branches usually share the same architecture and their parameters are updated together, while in distillation the teacher's parameters stay frozen. There are further differences in inputs, losses, and update rules, but distillation offers another way to think about the contrastive architecture. The momentum update and stop-gradient tricks common in contrastive learning can be read as a slowly updated teacher, a variant of distillation, so the contrastive network can be seen as two branches teaching each other, the left hand sparring with the right. The DINO paper [4] even labels the two branches of its architecture diagram teacher and student outright.
ͼ4 DINOËã·¨ÍøÂç½á¹¹
2. Contrastive learning and representation learning
Contrastive learning is one kind of representation learning: features obtained contrastively, transferred to a downstream task and fine-tuned, can match supervised learning, much like the handcrafted features of early CV. The contrastive loss is also designed from representation learning's premise: similar samples should stay close in feature space and dissimilar ones far apart. Supervised networks, too, perform well on classification and other tasks only because they learn good representations. What contrastive learning now aims for is an even more generalizable representation learned without labels. Foreseeably, contrastive models could replace ImageNet-pretrained models as the starting point for all kinds of training, because a contrastive training set can easily outgrow ImageNet and the learned representation generalizes beyond the classification task.
ͼ5 ¼à¶½Ñ§Ï°µÄÁ÷³Ì
3. Contrastive learning and autoencoders
The autoencoder is another unsupervised way to extract image features: an encoder maps the input to a feature, a decoder reconstructs the original image from that feature, and training minimizes the reconstruction error.
ͼ6 ×Ô±àÂëÆ÷ÍøÂç½á¹¹Ê¾Òâ
The autoencoder's encoding stage can be seen as a single branch of a contrastive network. The difference is that the autoencoder uses reconstruction of the output as its self-supervisory signal and thereby avoids trivial solutions, while the contrastive network relies on comparing the outputs of its two branches. In terms of feature extraction, contrastive learning constrains and optimizes the features directly, preserving both alignment (similar instances get nearby features) and uniformity (more information retained, with an even spread) of the feature distribution in the embedding space. Combining the two approaches is also worth trying: magic does not have to defeat magic, and two kinds of magic stacked together might create a marvelous world.
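Alignment and uniformity can be measured directly on a set of embeddings. The sketch below follows the definitions commonly used in the contrastive learning literature (average positive-pair distance, and the log average Gaussian potential over all pairs); the exponents alpha = 2 and t = 2 are typical defaults assumed here, since the article does not fix them.

```python
import torch
import torch.nn.functional as F

def alignment(z1: torch.Tensor, z2: torch.Tensor, alpha: int = 2) -> torch.Tensor:
    """Average distance between positive pairs; lower means better aligned."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    return (z1 - z2).norm(dim=1).pow(alpha).mean()

def uniformity(z: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Log average Gaussian potential between all pairs; lower means the
    embeddings spread more evenly over the unit hypersphere."""
    z = F.normalize(z, dim=1)
    sq_dists = torch.pdist(z, p=2).pow(2)
    return sq_dists.mul(-t).exp().mean().log()
```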
4. Contrastive learning and natural language processing
The success of self-supervised learning in natural language processing (NLP) is what led the contrastive wave in CV. Whether the success of word embeddings (Word2Vec) and similar methods can be replicated in vision is what drives the push toward self-supervised vision.
¶þÕßÒ²Óв»Í¬Ö®´¦£¬µ¥´Ê»ò¶ÌÓïµÄÊýÁ¿ÊÇÓÐÇîµÄ£¬¶øͼƬµÄÊýÁ¿ÔòÊÇÎÞÇîµÄ£¬Óï¾ä¿ÉÒÔͨ¹ýÑÚĤ£¨mask£©µÈ·½Ê½¹¹Ôì³ö¸÷ÖÖÀàÐ͵ı仯£¬Í¼Æ¬ÁìÓòµÄ±ä»¯ÈçºÎ¸ßЧµØ»ñµÃÑù±¾¶Ô²¢ÇÒÓÐÀûÓÚÏÂÓÎÈÎÎñµÄЧ¹ûÌáÉý¶¼ÊÇÒª½â¾öºÍÓÅ»¯µÄÎÊÌâ¡£Ò²Óи÷Àà¼òµ¥µÄÓ¦ÓÿÉÒÔÖ±½Ó½øÐÐǨÒÆ£¬±ÈÈçALBERT[5]Ìá³öÁ˾ä×Ó˳ÐòÔ¤²â£¨SOP£©ÈÎÎñ¿ÉÒÔÖ±½ÓǨÒƵ½ÊÓƵƬ¶ÎµÄ˳ÐòÔ¤²âÉÏÀ´¡£
5. Contrastive learning and generative adversarial networks (GANs)
Q: Can contrastive learning really have anything to do with GANs?
A: Hello, yes, it can.
Consider the architecture from the VideoMoCo paper [6]: a generator produces the similar sample pairs, and the discriminator is the contrastive learning framework itself. One could say that the discriminator's real-versus-fake task in a GAN is essentially the same as telling positive pairs from negative pairs in contrastive learning.
The generator VideoMoCo uses here is fairly naive, but it opens up a huge space for imagination. One of the hard parts of contrastive learning is constructing the pretext task, and current methods all rely on mechanical data augmentation. If a network were used to generate the positive/negative pair labels instead, could that boost contrastive learning's performance, or even widen its range of applications? Anything can be contrasted, as long as it can be generated.
ͼ7 videoMoCoËã·¨ÍøÂç½á¹¹
6. Contrastive learning, metric learning, and image retrieval
ͨ¹ýÓëÑо¿¶ÈÁ¿Ñ§Ï°µÄͬʽ»Á÷£¬´ÓÏà¹ØÍøÂçËã·¨ºÍËðʧº¯ÊýÀ´¿´£¬¶Ô±ÈѧϰºÍ¶ÈÁ¿Ñ§Ï°¹ØϵÃÜÇУ¬»òÕßÖ±½Ó¿´³ÉÊÇͬһ¸ÅÄîµÄÁ½Öֳƺô£¬Ä¿±ê¶¼ÊÇʹѧϰµ½µÄÌØÕ÷ÏàËƶÔÏó¼ä¾àÀëС£¬²»ÏàËƶÔÏó¼ä¾àÀë´ó¡£ÏÖÔÚ¶Ô±ÈѧϰÁìÓò´ó¶àʹÓÃInfoNCEËðʧº¯Êý£¬¶ø¶ÈÁ¿Ñ§Ï°ÓõĶàÖÖËðʧ»¹ÏÊÓÐÉæ¼°£¬½«ÕâЩËðʧÒýÓùýÀ´Ò²ÊÇÓпÉÄܽøÒ»²½ÓÅ»¯µÄ·½Ïò¡£
ͼÏñ¼ìË÷ÊÇÎÒÃdz¢ÊÔ½«¶Ô±Èѧϰ×÷Ϊʵ¼ÊÓ¦ÓõÄÖØÒªÁìÓò£¬¶Ô±Èѧϰ¿ÉÒÔÌìÈ»µØµÃµ½Í¼Ïñembeeding£¬²¢ÇÒÒ²¾ßÓÐÅбðÏàËÆͼÏñ»òÕß·ÇÏàËÆͼÏñµÄÌص㣬ÔÚijЩ¼ìË÷ÐèÇóÏ£¬ÊÇÍêÃÀµÄÂäµØÓ¦Óá£ÎÒÃÇÒ²³¢ÊÔ¹ý½«¶Ô±ÈѧϰģÐͺÍArcFaceѵÁ·µÄÄ£ÐÍ×ö¶Ô±È£¬¶þÕßÔÚembeddingÖ®ºóÓ¦ÓÃÓÚͼÏñ¼ìË÷ÖУ¬¼òµ¥ÑéÖ¤µÄ²îÒì²¢²»´ó£¬ÔÚÄ£ÐÍÊÊÓ¦ÐÔÉÏ£¬ÔʼµÄÊý¾ÝÔöÇ¿¶àÑùÐÔ´øÀ´µÄÓ°Ïì¸ü´ó¡£
III. Trends in Contrastive Self-Supervised Learning
1. The road from complexity to simplicity
Reading the earlier contrastive learning papers, one may well be puzzled: why does stop-gradient work, and what exactly does momentum contribute? Neither seems very intuitive. Later methods dropped the momentum update, then the negative samples, and Barlow Twins [7] went all in, discarding every clever trick and grounding contrastive learning in the most intuitive object of all, the cross-correlation matrix, with almost maddening simplicity. In hindsight the various methods and losses all boil down to that cross-correlation matrix, which handles pair sampling cleanly and gives more efficient sampling and larger effective data scale than the other algorithms. Earlier methods kept slashing around the enemy's heart; Barlow Twins is the swordsman who thrusts straight into it. Its 8192-dimensional projection head, of course, is itself worth discussing.
It is by now a consensus in contrastive learning that the number of negative samples matters a great deal for feature learning. Barlow Twins instead targets redundancy reduction across feature dimensions. Turning the idea around, we could convert the cross-correlation matrix into a similarity matrix over the images within a batch, thereby obtaining large numbers of negatives to improve the model and training an efficient contrastive model without being limited by hardware. The prior assumption of such an approach, of course, is that all images within the same batch are mutual negatives.
The choice of loss function also has a back-to-the-classics feel. Below are the loss used in Yann LeCun's 2006 paper [13] and the Barlow Twins loss; take a look, don't these two losses look like twins themselves?
The contrastive loss proposed in 2006
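As given in [13], with $Y = 0$ for similar pairs, $Y = 1$ for dissimilar pairs, margin $m$, and $D_W = \lVert G_W(X_1) - G_W(X_2) \rVert_2$ the distance between the two mapped inputs:

$$L(W, Y, X_1, X_2) = (1 - Y)\,\tfrac{1}{2} D_W^2 + Y\,\tfrac{1}{2}\,\max(0,\; m - D_W)^2$$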
The cross-correlation contrastive loss used by Barlow Twins
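And as defined in [7], where $\mathcal{C}$ is the cross-correlation matrix computed between the two views' (batch-normalized) embeddings $z^A$ and $z^B$ along the batch dimension $b$, and $\lambda$ trades off the two terms:

$$\mathcal{L}_{\mathrm{BT}} = \sum_i \big(1 - \mathcal{C}_{ii}\big)^2 + \lambda \sum_i \sum_{j \neq i} \mathcal{C}_{ij}^2, \qquad \mathcal{C}_{ij} = \frac{\sum_b z^A_{b,i}\, z^B_{b,j}}{\sqrt{\sum_b \big(z^A_{b,i}\big)^2}\,\sqrt{\sum_b \big(z^B_{b,j}\big)^2}}$$

The invariance term drives the diagonal toward 1 (the analogue of pulling positives together), while the redundancy-reduction term drives the off-diagonal toward 0.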
2. Transformer or MLP?
2021 Äê4Ô³õ£¬³ÂöÎÀÚ£¬ºÎâýÃ÷µÈ´óÉñÓÖ·¢²¼ÁËMoCo V3[8]°æ±¾µÄ×Լල·½·¨£¬½«Visual
Transformers£¨ViT£©ÒýÈëµ½¶Ô±ÈѧϰÖÐÀ´¡£4Ôµף¬DINO[4]ÂÛÎÄ·¢²¼£¬Ö¸³öÁË×ԼලµÄViTÌØÕ÷°üº¬Ã÷ÏÔµÄÓïÒå·Ö¸îÐÅÏ¢£¬ÔÚÓмලµÄViTºÍ¾í»ýÍøÂçÖж¼Ã»ÓÐÀàËƵıíÏÖ¡£
ͼ8 DINOËã·¨·Ö¸îЧ¹ûչʾ
In vision there is a real sense that Transformers may replace convolutional networks, like a young upstart flailing wildly and knocking out the old master. They have already pushed past simple image classification into self-supervised learning and shown even stronger properties there; more research on self-supervised Transformers is surely coming.
Or perhaps the reborn MLP methods [9] will also make a splash in self-supervision. Echoing the title of [9], "MLP-Mixer: An all-MLP architecture for vision", I already have the title for the self-supervised MLP paper ready: "An all-MLP Architecture for Self-Supervised Learning".
3.¶Ô±È×ԼලÔÚÊÓƵÁìÓòµÄÓ¦ÓÃ
Contrastive learning has many applications in video as well. [10] feeds clips played at different speeds into a contrastive network and trains the model to judge playback pace. Background Erasing [11] superimposes a random frame from the same video onto every frame, weakening the background's influence on the model's decision and improving action recognition accuracy; the network's inputs are the normal video and the frame-superimposed version. [12] samples different clips from the same video and treats them as video data augmentations, i.e. positive pairs, to learn video representations.
In current video applications the pretext task too often coincides with the downstream task, so the model ends up discriminative only for that particular task. At the same time, video representation learning visibly copies image methods wholesale, transferring them by simply swapping 2D convolutions for 3D ones. The research is still in its infancy; in my view, feature extraction for video sequences deserves work aimed specifically at the peculiarities of the temporal dimension.
Progress in video representation learning is bound to push video retrieval forward. Self-supervised learning could be used to build video-to-video search, and also cross-modal retrieval such as searching video by text or by speech. Dreaming in the other direction, video could in turn generate text, or generate speech.
4. Supervised contrastive learning
Contrastive learning, which shone in the self-supervised setting, can also be applied to supervised learning, and paper [14] does exactly that. In self-supervision the contrastive criterion is whether two images come from the same source; combined with supervision it becomes whether two images belong to the same class. With the supervised contrastive loss, the method outperforms cross-entropy.
ͼ9 ×Լල¶Ô±ÈºÍ¼à¶½¶Ô±È
That said, the core of the method is still to train the embedding network contrastively, then freeze the feature extractor and train a fully connected classifier on top. In essence this is the same as transferring a self-supervised network to a downstream task; the key is that the construction of the pretext task absorbs the information in the supervised labels. It confirms once more the magic aura of self-supervised learning, and shows how good the contrastive loss is at extracting effective features compared with the classification cross-entropy.
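For reference, the supervised contrastive loss of [14] (its "L_out" variant) sums, for each anchor $i$, over the set $P(i)$ of other samples in the batch that share its label, with $A(i)$ the set of all other samples and $\tau$ a temperature:

$$\mathcal{L}^{\mathrm{sup}} = \sum_{i \in I} \frac{-1}{\lvert P(i) \rvert} \sum_{p \in P(i)} \log \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a \in A(i)} \exp(z_i \cdot z_a / \tau)}$$

Setting $P(i)$ to just the other augmented view of $i$ recovers the self-supervised case, which is exactly the same-source versus same-class shift described above.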
IV. Some Thoughts
1. Theoretical foundations
Although self-supervised learning has delivered excellent results, the mathematics and theory behind it are not particularly solid; model structures and strategies are mostly justified after the fact from experimental results, which may have sent a lot of research down detours. Starting from theoretical foundations and heading straight for the final goal might work better.
2. Constructing pretext tasks
Pretext task construction, especially for video, is currently driven mostly by the downstream task, with no settled paradigm or rules. What the pretext task can do bounds what the self-supervised model can do. Because pretext tasks come in every flavor, the resulting tasks differ wildly and cannot really be ranked against one another; each is just the same network applied to yet another task. In the image domain pretext tasks are mostly built from combinations of data augmentations; the video domain could likewise use a unified way of constructing them.
Pretext tasks that can be produced "semi-automatically" are few and far between, and across the various image applications this may be the stumbling block limiting how well self-supervised methods adapt.
3. Can we build end-to-end learning that goes straight to the downstream task?
Since [4] has already found clear semantic segmentation features in self-supervised models, it is worth investigating whether attaching a segmentation branch to the back end of a contrastive model would help the network learn, or whether a usable segmentation network could even be trained directly.
4. Building feature extraction networks in forms other than contrast
Essentially, the contrastive network is just one way, beyond conventional networks, of training a feature representation, much in the same spirit as the autoencoder discussed earlier. Its success lies in the fact that the trained feature extractor performs excellently on downstream tasks, which in turn shows the extracted features are effective. The lesson is to ask whether other ways of constructing and training networks could also yield effective features. A new paradigm, once proposed, would surely lead another wave of research just as contrastive learning has.
5. A vast field with much to be done
Self-supervised learning is still at the exploration stage, with plenty left to dig into, and it will surely find wide application in both academia and industry. As one of deep learning's magic tricks, it needs more people to tap its potential and work more wonders.
Summary
This article has looked at the currently popular topic of self-supervised learning as studied in CV, sorted out its similarities to and differences from other CV approaches, and discussed several frontier research questions. I hope it leaves readers with a clearer sense of where self-supervised learning stands; if it is of any help to your research or thinking, so much the better.
Renmin Zhongke (Jinan) Intelligent Technology Co., Ltd. is an "intelligent technology engine" and "talent innovation platform" jointly built by People's Daily Online and the Institute of Automation, Chinese Academy of Sciences. It focuses on audio and video content understanding and, around the theme of "content understanding + industry applications", provides content understanding algorithms, software systems, and hardware, delivering professional AI solutions to a range of industries.
References:
[1] https://www.paperswithcode.com/task/self-supervised-learning
[2] Doersch C, Gupta A, Efros A A. Unsupervised visual representation
learning by context prediction[C]//Proceedings of the IEEE international
conference on computer vision. 2015: 1422-1430.
[3] Chen T, Kornblith S, Norouzi M, et al. A simple framework for contrastive
learning of visual representations[C]//International conference on machine
learning. PMLR, 2020: 1597-1607.
[4] Caron M, Touvron H, Misra I, et al. Emerging properties in
self-supervised vision transformers[J]. arXiv preprint arXiv:2104.14294, 2021.
[5] Lan Z, Chen M, Goodman S, et al. Albert: A lite bert for self-supervised
learning of language representations[J]. arXiv preprint arXiv:1909.11942, 2019.
[6] Pan T, Song Y, Yang T, et al. Videomoco: Contrastive video representation
learning with temporally adversarial examples[J]. arXiv preprint
arXiv:2103.05905, 2021.
[7] Zbontar J, Jing L, Misra I, et al. Barlow twins: Self-supervised learning
via redundancy reduction[J]. arXiv preprint arXiv:2103.03230, 2021.
[8] Chen X, Xie S, He K. An empirical study of training self-supervised
visual transformers[J]. arXiv preprint arXiv:2104.02057, 2021.
[9] Tolstikhin I, Houlsby N, Kolesnikov A, et al. MLP-Mixer: An all-MLP
architecture for vision[J]. arXiv preprint arXiv:2105.01601, 2021.
[10] Wang J, Jiao J, Liu Y H. Self-supervised video representation learning
by pace prediction[C]//European Conference on Computer Vision. Springer, Cham,
2020: 504-521.
[11] Wang J, Gao Y, Li K, et al. Removing the Background by Adding the
Background: Towards Background Robust Self-supervised Video Representation
Learning[J]. arXiv preprint arXiv:2009.05769, 2020.
[12] Feichtenhofer C, Fan H, Xiong B, et al. A Large-Scale Study on
Unsupervised Spatiotemporal Representation Learning[J]. arXiv preprint
arXiv:2104.14558, 2021.
[13] Hadsell R, Chopra S, LeCun Y. Dimensionality reduction by learning an
invariant mapping[C]//2006 IEEE Computer Society Conference on Computer Vision
and Pattern Recognition (CVPR'06). IEEE, 2006, 2: 1735-1742.
[14] Khosla P, Teterwak P, Wang C, et al. Supervised contrastive learning[J].
arXiv preprint arXiv:2004.11362, 2021.