Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and latency on the order of microseconds. We propose a novel, accurate, tightly-coupled visual-inertial odometry pipeline for such cameras that leverages their outstanding properties to estimate the camera ego-motion in challenging conditions, such as high-speed motion or high-dynamic-range scenes. The method tracks a set of features (extracted on the image plane) through time. To achieve that, we consider events in overlapping spatio-temporal windows and align them using the current camera motion and scene structure, yielding motion-compensated event frames. We then combine these feature tracks in a keyframe-based, visual-inertial odometry algorithm based on nonlinear optimization to estimate the camera's 6-DOF pose, velocity, and IMU biases. The proposed method is evaluated quantitatively on the public Event Camera Dataset and significantly outperforms the state-of-the-art, while being computationally much more efficient: our pipeline can run much faster than real-time on a laptop and even on a smartphone processor. Furthermore, we demonstrate qualitatively the accuracy and robustness of our pipeline on a large-scale dataset, and on an extremely high-speed dataset recorded by spinning an event camera on a leash at 850 deg/s.
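The core idea of motion-compensated event frames can be sketched in a few lines: events inside a temporal window are warped to a common reference time using the current motion estimate, so that events triggered by the same scene point accumulate at the same pixel. The sketch below is a deliberately simplified illustration (a constant 2-D flow stands in for the paper's full camera-motion and scene-structure model; all names are ours):

```python
import numpy as np

def motion_compensated_frame(events, t_ref, flow, shape):
    """Accumulate events into a frame after warping each event to a
    reference time t_ref using a constant optical-flow estimate.
    `events` is a list of (x, y, t); `flow` is (vx, vy) in px/s.
    Illustrative sketch only, not the paper's exact model."""
    frame = np.zeros(shape)
    vx, vy = flow
    for x, y, t in events:
        dt = t_ref - t
        xw = int(round(x + vx * dt))  # warp event back to t_ref
        yw = int(round(y + vy * dt))
        if 0 <= xw < shape[1] and 0 <= yw < shape[0]:
            frame[yw, xw] += 1  # aligned events pile up sharply
    return frame
```

With a correct motion estimate, events from one edge collapse onto a single sharp pixel instead of smearing along the motion direction; features can then be tracked on these sharp frames.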
In this paper, we discuss the adaptation of our decentralized place recognition method described in  to full-image descriptors. As we had shown, the key to making decentralized visual place recognition scalable lies in exploiting deterministic key assignment in a distributed key-value map. Through this, it is possible to reduce bandwidth by up to a factor of n, the robot count, by casting visual place recognition as a key-value lookup problem. In , we exploited this for the bag-of-words method , . Our way of casting bag-of-words, however, results in a complex decentralized system, which has inherently worse recall than its centralized counterpart. In this paper, we instead start from the recent full-image description method NetVLAD . As we show, casting it as a key-value lookup problem can be achieved with k-means clustering, and results in a much simpler system than . The resulting system still has some flaws, albeit of a completely different nature: it suffers when the environment seen during deployment lies in a different distribution in feature space than the environment seen during training.
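The deterministic key assignment described above can be sketched as follows: each full-image descriptor is mapped to its nearest k-means centroid, and each centroid is owned by exactly one robot, so a query is sent to a single peer instead of being broadcast to all n robots. This is a toy illustration under our own naming, not the paper's implementation:

```python
import numpy as np

def responsible_robot(descriptor, centroids, n_robots):
    """Deterministic key assignment: the robot owning the descriptor's
    nearest k-means centroid answers the place-recognition query.
    Cluster c is assigned to robot c % n_robots (illustrative scheme)."""
    dists = np.linalg.norm(centroids - np.asarray(descriptor), axis=1)
    cluster = int(np.argmin(dists))
    return cluster % n_robots

# Toy 2-D "descriptors": 4 cluster centres shared among 3 robots.
centroids = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
```

Because the centroid-to-robot mapping is the same on every robot, no coordination is needed at query time; this is what turns place recognition into a key-value lookup.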
Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. These cameras do not suffer from motion blur and have a very high dynamic range, which enables them to provide reliable visual information during high-speed motions or in scenes characterized by high dynamic range. These features, along with a very low power consumption, make event cameras an ideal complement to standard cameras for VR/AR and video game applications. With these applications in mind, this paper tackles the problem of accurate, low-latency tracking of an event camera from an existing photometric depth map (i.e., intensity plus depth information) built via classic dense reconstruction pipelines. Our approach tracks the 6-DOF pose of the event camera upon the arrival of each event, thus virtually eliminating latency. We successfully evaluate the method in both indoor and outdoor scenes and show that—because of the technological advantages of the event camera—our pipeline works in scenes characterized by high-speed motion, which are still inaccessible to standard cameras.
Biology illustrates how complex functionality can emerge from systems of simple materials.
The aim of this study was to test the feasibility and accuracy of a smartphone application that measures the body length of children using the integrated camera, and to evaluate the subsequent weight estimates. A prospective clinical trial of children aged 0–<13 years admitted to the emergency department of the University Children’s Hospital Zurich. The primary outcome was to validate the length measurement by the smartphone application «Optisizer». The secondary outcome was to correlate the virtually calculated ordinal categories based on the length measured by the app to the categories based on the real length. The third and independent outcome was the comparison of the different weight estimations by physicians, nurses, parents and the app. For all 627 children, the Bland–Altman analysis showed a bias of −0.1% (95% CI −0.3–0.2%) comparing real length and length measured by the app. Ordinal categories of real length were in excellent agreement with categories virtually calculated based upon app length (kappa = 0.83, 95% CI 0.79–0.86). Children’s real weight was underestimated by physicians (−3.3, 95% CI −4.4 to −2.2%, p < 0.001), nurses (−2.6, 95% CI −3.8 to −1.5%, p < 0.001) and parents (−1.3, 95% CI −1.9 to −0.6%, p < 0.001) but overestimated by categories based upon app length (1.6, 95% CI 0.3–2.8%, p = 0.02) and categories based upon real length (2.3, 95% CI 1.1–3.5%, p < 0.001). Absolute weight differences were lowest when estimated by the parents (5.4, 95% CI 4.9–5.9%, p < 0.001). This study showed the accuracy of length measurement of children by a smartphone application: body length determined by the smartphone application is in good agreement with the real patient length. Ordinal length categories derived from app-measured length are in excellent agreement with the ordinal length categories based upon the real patient length. The body weight estimations based upon length corresponded to known data and limitations.
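For readers unfamiliar with the Bland–Altman analysis used above: it summarizes agreement between two measurement methods by the mean of their (here, percent) differences (the bias) and the 95% limits of agreement (bias ± 1.96 SD). A minimal sketch of that computation, not the study's actual code:

```python
import numpy as np

def bland_altman_percent(real, measured):
    """Percent difference of each pair relative to the pair mean,
    returning (bias, lower limit, upper limit of agreement).
    Illustrative sketch of the standard Bland-Altman quantities."""
    real = np.asarray(real, dtype=float)
    measured = np.asarray(measured, dtype=float)
    diff_pct = 100.0 * (measured - real) / ((measured + real) / 2.0)
    bias = diff_pct.mean()
    sd = diff_pct.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A bias near 0% with narrow limits, as reported for the app (−0.1%), indicates that neither method systematically over- or under-measures relative to the other.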
The precision of body weight estimations by paediatric physicians and nurses was comparable and did not differ from length-based estimations. In this non-emergency setting, parental weight estimation was significantly better than all other means of estimation (paediatric physicians and nurses, length-based estimations) in terms of precision and absolute difference.
Event cameras offer many advantages over standard frame-based cameras, such as low latency, high temporal resolution, and a high dynamic range. They respond to pixel-level brightness changes and, therefore, provide a sparse output. However, in textured scenes with rapid motion, millions of events are generated per second. Therefore, state-of-the-art event-based algorithms either require massive parallel computation (e.g., a GPU) or depart from the event-based processing paradigm. Inspired by frame-based pre-processing techniques that reduce an image to a set of features, which are typically the input to higher-level algorithms, we propose a method to reduce an event stream to a corner event stream. Our goal is twofold: extract relevant tracking information (corners do not suffer from the aperture problem) and decrease the event rate for later processing stages. Our event-based corner detector is very efficient due to its design principle, which consists of working on the Surface of Active Events (a map with the timestamp of the latest event at each pixel) using only comparison operations. Our method processes events asynchronously, one by one, with very low latency. Our implementation is capable of processing millions of events per second on a single core (less than a microsecond per event) and reduces the event rate by a factor of 10 to 20.
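The Surface of Active Events is simply a per-pixel map of the latest event timestamp, updated in O(1) per event; cornerness is then decided by timestamp comparisons on that map. The sketch below illustrates the data structure with a deliberately simplified neighbourhood test (counting recently active 8-neighbours), which is our own stand-in and not the paper's actual corner criterion:

```python
import numpy as np

def process_event(sae, x, y, t, window=0.05):
    """Update the Surface of Active Events (timestamp of the latest
    event at each pixel) and report whether the pixel looks
    'corner-like': here, whether at least 3 of its 8 neighbours fired
    within `window` seconds. Comparison operations only; the threshold
    and neighbourhood are illustrative, not the paper's detector."""
    sae[y, x] = t
    h, w = sae.shape
    recent = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and t - sae[ny, nx] < window:
                recent += 1
    return recent >= 3
```

Because each event touches only a fixed small neighbourhood and uses only comparisons, the per-event cost is constant, which is what makes sub-microsecond processing on a single core plausible.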