{"id":568,"date":"2023-12-13T08:23:30","date_gmt":"2023-12-13T08:23:30","guid":{"rendered":"https:\/\/faai.ath.edu.pl\/?p=568"},"modified":"2023-12-13T12:02:23","modified_gmt":"2023-12-13T12:02:23","slug":"detekcia-nezvycajnosti-signalu-ako-vnutorna-odmena-pre-robotiku","status":"publish","type":"post","link":"https:\/\/faai.ath.edu.pl\/?p=568&lang=sk","title":{"rendered":"Publication: Signal novelty detection as an intrinsic reward for robotics"},"content":{"rendered":"<p>In advanced robot control, reinforcement learning is a common technique used to transform sensor data into signals for actuators, based on feedback from the robot's environment. However, the feedback or reward is typically sparse, as it is provided mainly after the task's completion or failure, which leads to slow convergence. Additional intrinsic rewards based on state visitation frequency can provide more feedback. In this study, an Autoencoder deep learning neural network was used as novelty detection for intrinsic rewards to guide the search process through the state space. The neural network processed signals from various types of sensors simultaneously. 
It was tested on simulated robotic agents in a benchmark set of classic control OpenAI Gym test environments (including Mountain Car, Acrobot, CartPole, and LunarLander), achieving more efficient and accurate robot control in three of the four tasks (with only slight degradation in the Lunar Lander task) when purely intrinsic rewards were used compared to standard extrinsic rewards. By incorporating autoencoder-based intrinsic rewards, robots could potentially become more dependable in autonomous operations such as space or underwater exploration, or during natural disaster response, because the system could better adapt to changing environments or unexpected situations.<\/p>\n<p>The full paper is available at this link: <a href=\"https:\/\/faai.ath.edu.pl\/wp-content\/uploads\/2023\/12\/Sensors-Kubovcik-Signal-novelty-detection-as-an-intrinsic-reward-for-robotics03d_SJ.pdf\">Sensors Kubovcik Signal novelty detection as an intrinsic reward for robotics03d_SJ<\/a><\/p>\n<p>or\u00a0<a href=\"https:\/\/doi.org\/10.3390\/s23083985\">https:\/\/doi.org\/10.3390\/s23083985<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In advanced robot control, reinforcement learning is a common technique used to transform sensor data into signals for actuators, based on feedback from the robot's environment. 
However, the feedback or reward is typically sparse, as it is provided mainly after the task's completion or failure, which leads to slow convergence. Additional intrinsic rewards based &hellip; <\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/faai.ath.edu.pl\/?p=568&#038;lang=sk\"> Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":32,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[72],"tags":[],"class_list":["post-568","post","type-post","status-publish","format-standard","hentry","category-spravy"],"_links":{"self":[{"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=\/wp\/v2\/posts\/568","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=\/wp\/v2\/users\/32"}],"replies":[{"embeddable":true,"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=568"}],"version-history":[{"count":5,"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=\/wp\/v2\/posts\/568\/revisions"}],"predecessor-version":[{"id":574,"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=\/wp\/v2\/posts\/568\/revisions\/574"}],"wp:attachment":[{"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=568"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=568"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/faai.ath.edu.pl\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=568"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}