vision(2)
-
[Generative] Captioning and Tagging Images with Llama
A. Tools - OS/Platform/Tool: Linux, Kubernetes (k8s), Docker, AWS - Package Manager: node.js, yarn, brew - Compiler/Transpiler: React, Nvcc, gcc/g++, Babel, Flutter - Module Bundler: React, Webpack, Parcel B. Languages - C/C++, Python, JavaScript, TypeScript, Go, CUDA, Dart, HTML/CSS C. Libraries, Frameworks, and SDKs - OpenCV, OpenCL, FastAPI, PyTorch, TensorFlow, Nsight 1. What? (Phenomenon) LLaMA(Large Language Model ..
2024.12.27 -
[Multi-Modal Fusion] DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection (CVPR'22)
Paper: https://openaccess.thecvf.com/content/CVPR2022/papers/Li_DeepFusion_Lidar-Camera_Deep_Fusion_for_Multi-Modal_3D_Object_Detection_CVPR_2022_paper.pdf Authors: Google + Johns Hopkins Univ., CVPR'22 Main Idea: Proposes a method for improving the correspondence between image and lidar data. Tasks: 3D Object Detection Results: Waymo 1. Problem: Limitations of the Mid-Level Fusion and Point Decoration approaches 2. Approach: InverseAug & LearnableAli..
2022.12.26