Posts from 2025/07

List: 2025/07 (2)

JINWOOJUNG

3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

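Paper: https://arxiv.org/pdf/1910.02527
Supplementary material: https://3dscenegraph.stanford.edu/images/supp_mat.pdf
Introduction: Effectively storing information such as the geometric structure of objects and spaces, their categories (classes), and the viewpoint of a particular scene is a very important problem. The ideal space for storing this information should be invariant to change; that is, it should capture the full spatial information in an invariant way. It should also connect easily and deterministically to various domains such as images and video. In this respect, images are not an ideal solution: they are constrained by viewpoint and cannot effectively handle information such as depth and size. Therefore, in this paper, 3D Scene Gra..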

NLP, LLM, Multi-modal 2025. 7. 17. 21:02

BAT: Learning to Reason about Spatial Sounds with Large Language Models

Paper: https://arxiv.org/abs/2402.01591
Abstract (arxiv.org link preview, truncated): Spatial sound reasoning is a fundamental human skill, enabling us to navigate and interpret our surroundings based on sound. In this paper we present BAT, which combines the spatial sound perception ability of a binaural acoustic scene analysis model with..
Introduction: BLIP-2, CLIP..

NLP, LLM, Multi-modal 2025. 7. 4. 12:17
