Channel: Raspberry Pi Forums

Camera board • Re: When using libcamera accessing the camera buffer is very slow.

Another thing that causes quite a bit of slowdown is mmapping buffers on the fly on every request completion. A better approach is to map the buffers once at startup, hold onto the buffer pointers, and do a pointer lookup when a buffer completes in a request and you want to access its data. This works because the buffers are allocated once at startup and recycled at runtime.

Again, rpicam-apps and picamera2 do this as well.
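To make the map-once-and-look-up idea concrete, here is a minimal sketch that runs without a camera. It is not the rpicam-apps or picamera2 code: MappingCache, MappedRegion and makeBackingFile are hypothetical names, the key is an opaque pointer standing in for a FrameBuffer *, and an ordinary temporary file stands in for the buffer's fd.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>

// Hypothetical stand-in for one mapped plane: a pointer plus its length.
struct MappedRegion {
    uint8_t *data = nullptr;
    size_t length = 0;
};

// Hypothetical cache: maps each buffer's fd once and remembers the pointer.
// The key is an opaque pointer, standing in for a FrameBuffer *.
class MappingCache {
public:
    MappingCache() = default;
    MappingCache(const MappingCache &) = delete;
    MappingCache &operator=(const MappingCache &) = delete;

    ~MappingCache() {
        for (auto &entry : regions_)
            munmap(entry.second.data, entry.second.length);
    }

    // Call once per buffer at startup.
    void add(const void *key, int fd, size_t length) {
        assert(length > 0);
        if (regions_.count(key))
            throw std::runtime_error("buffer already mapped");
        void *mem = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (mem == MAP_FAILED)
            throw std::runtime_error("mmap failed");
        regions_[key] = MappedRegion{ static_cast<uint8_t *>(mem), length };
    }

    // Call on every completed request: a hash lookup, no system call.
    const MappedRegion &find(const void *key) const {
        auto it = regions_.find(key);
        if (it == regions_.end())
            throw std::runtime_error("buffer was never mapped");
        return it->second;
    }

    size_t size() const { return regions_.size(); }

private:
    std::unordered_map<const void *, MappedRegion> regions_;
};

// Test helper (an assumption of this sketch): an unlinked temporary file of
// the given size, used in place of a real camera buffer fd.
int makeBackingFile(size_t length) {
    FILE *f = tmpfile();
    if (!f)
        throw std::runtime_error("tmpfile failed");
    int fd = dup(fileno(f));
    fclose(f);
    if (fd < 0)
        throw std::runtime_error("dup failed");
    if (ftruncate(fd, static_cast<off_t>(length)) != 0) {
        close(fd);
        throw std::runtime_error("ftruncate failed");
    }
    return fd;
}
```

In a libcamera program, add() would be called for each allocated buffer before any request is queued, and find() would be called from the request-completed handler. The mapping stays valid after the fd is closed, so the cache only needs the fd at startup.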
I modified simple-cam to use the old mmapping code from rpicam-apps (the version from before it used dmabufs), as shown below.

Code:

// Assumes cameras[0] is the acquired camera, as in processRequest below.
allocator = new FrameBufferAllocator(cameras[0]);

for (StreamConfiguration &cfg : *config) {
    Stream *stream = cfg.stream();

    if (allocator->allocate(stream) < 0)
        std::cerr << "Can't allocate buffers" << std::endl;

    for (const std::unique_ptr<FrameBuffer> &buffer : allocator->buffers(stream)) {
        // Planes that share an fd are mapped together as one region.
        size_t buffer_size = 0;
        for (unsigned i = 0; i < buffer->planes().size(); i++) {
            const FrameBuffer::Plane &plane = buffer->planes()[i];
            buffer_size += plane.length;
            // Map once we reach the last plane, or the next plane uses a different fd.
            if (i == buffer->planes().size() - 1 || plane.fd.get() != buffer->planes()[i + 1].fd.get()) {
                void *memory = mmap(NULL, buffer_size, PROT_READ | PROT_WRITE, MAP_SHARED, plane.fd.get(), 0);
                mapped_buffers_[buffer.get()].push_back(Span<uint8_t>(static_cast<uint8_t *>(memory), buffer_size));
                buffer_size = 0;
            }
        }
        frame_buffers_[stream].push(buffer.get());
    }

    size_t allocated = allocator->buffers(cfg.stream()).size();
    std::cout << "Allocated " << allocated << " buffers for stream" << std::endl;
}

makeRequests();
And then in the processRequest function, I annotate the sensor framerate using OpenCV, as seen in the code below. I based the code on what the post processor stages are doing in rpicam-apps.

Code:

static void processRequest(Request *request)
{
    float framerate = 0;
    auto ts = request->metadata().get(controls::SensorTimestamp);
    uint64_t timestamp = ts ? *ts : request->buffers().begin()->second->metadata().timestamp;
    if (last_timestamp_ == 0 || last_timestamp_ == timestamp)
        framerate = 0;
    else
        framerate = 1e9 / (timestamp - last_timestamp_);
    last_timestamp_ = timestamp;

    const Request::BufferMap &buffers = request->buffers();
    for (auto bufferPair : buffers) {
        const Stream *stream = bufferPair.first;
        StreamConfiguration const &cfg = stream->configuration();

        auto start = std::chrono::high_resolution_clock::now();

        auto it = mapped_buffers_.find(bufferPair.second);
        if (it == mapped_buffers_.end())
            throw std::runtime_error("failed to identify queue request buffer");
        libcamera::Span<uint8_t> buffer = it->second[0];
        uint8_t *ptr = (uint8_t *)buffer.data();

        cv::Mat image(cfg.size.height, cfg.size.width, CV_8UC1, ptr, cfg.stride);
        int font = cv::FONT_HERSHEY_SIMPLEX;
        std::string text = "Framerate: " + std::to_string(framerate);
        double adjusted_scale_ = 1.0 * cfg.size.width / 1200;
        double adjusted_thickness_ = std::max(2 * cfg.size.width / 700, 1u);
        int baseline = 0;
        cv::Size size = cv::getTextSize(text, font, adjusted_scale_, adjusted_thickness_, &baseline);
        int bg_ = 0;
        int fg_ = 255;
        double alpha_ = 0.3;

        // Can't find a handy "draw rectangle with alpha" function...
        for (int y = 0; y < size.height + baseline; y++, ptr += cfg.stride) {
            for (int x = 0; x < size.width; x++)
                ptr[x] = bg_ * alpha_ + (1 - alpha_) * ptr[x];
        }
        cv::putText(image, text, cv::Point(0, size.height), font, adjusted_scale_, fg_, adjusted_thickness_, 0);

        auto stop = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
        if (mmapTimes.size() > 30) {
            std::cout << std::reduce(mmapTimes.begin(), mmapTimes.end()) / mmapTimes.size() << "\n";
            mmapTimes.clear();
        }
        mmapTimes.push_back(duration.count());

        int fd = bufferPair.second->planes()[0].fd.get();
    }

    /* Re-queue the Request to the camera. */
    request->reuse(Request::ReuseBuffers);
    cameras[0]->queueRequest(request);
}
On average, the timed section (buffer lookup plus the OpenCV annotation) takes 9 microseconds to complete. If I instead use the regular simple-cam buffers and mmap each time in processRequest, the same section takes an average of 8 microseconds.

Code:

static void processRequest(Request *request)
{
    float framerate = 0;
    auto ts = request->metadata().get(controls::SensorTimestamp);
    uint64_t timestamp = ts ? *ts : request->buffers().begin()->second->metadata().timestamp;
    if (last_timestamp_ == 0 || last_timestamp_ == timestamp)
        framerate = 0;
    else
        framerate = 1e9 / (timestamp - last_timestamp_);
    last_timestamp_ = timestamp;

    const Request::BufferMap &buffers = request->buffers();
    for (auto bufferPair : buffers) {
        const Stream *stream = bufferPair.first;
        StreamConfiguration const &cfg = stream->configuration();

        auto start = std::chrono::high_resolution_clock::now();

        FrameBuffer *buffer = bufferPair.second;
        int fd = buffer->planes()[0].fd.get();
        // Note: this mapping is never munmap()ed, so every frame adds a new mapping.
        uint8_t *ptr = static_cast<uint8_t *>(mmap(NULL, buffer->planes()[0].length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));

        cv::Mat image(cfg.size.height, cfg.size.width, CV_8UC1, ptr, cfg.stride);
        int font = cv::FONT_HERSHEY_SIMPLEX;
        std::string text = "Framerate: " + std::to_string(framerate);
        double adjusted_scale_ = 1.0 * cfg.size.width / 1200;
        double adjusted_thickness_ = std::max(2 * cfg.size.width / 700, 1u);
        int baseline = 0;
        cv::Size size = cv::getTextSize(text, font, adjusted_scale_, adjusted_thickness_, &baseline);
        int bg_ = 0;
        int fg_ = 255;
        double alpha_ = 0.3;

        // Can't find a handy "draw rectangle with alpha" function...
        for (int y = 0; y < size.height + baseline; y++, ptr += cfg.stride) {
            for (int x = 0; x < size.width; x++)
                ptr[x] = bg_ * alpha_ + (1 - alpha_) * ptr[x];
        }
        cv::putText(image, text, cv::Point(0, size.height), font, adjusted_scale_, fg_, adjusted_thickness_, 0);

        auto stop = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
        if (mmapTimes.size() > 30) {
            std::cout << std::reduce(mmapTimes.begin(), mmapTimes.end()) / mmapTimes.size() << "\n";
            mmapTimes.clear();
        }
        mmapTimes.push_back(duration.count());
    }

    /* Re-queue the Request to the camera. */
    request->reuse(Request::ReuseBuffers);
    camera->queueRequest(request);
}
I was expecting worse performance from mmapping each time in processRequest, but this is not the case. Did I incorrectly handle the pointer lookup when accessing the mmapped buffers?
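In both versions the timer also covers the OpenCV drawing, so the two averages may not isolate the cost of getting at the buffer. One way to check that step on its own is a micro-benchmark like the sketch below. It is only an illustration under stated assumptions: averageNanoseconds, makeScratchFd, AccessTimings and compareAccess are hypothetical names, and an ordinary temporary file stands in for the camera buffer, so the absolute numbers will not match dmabuf-backed buffers.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: average nanoseconds per call of `step`.
template <typename Step>
double averageNanoseconds(int iterations, Step step) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; i++)
        step();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}

// Assumption of this sketch: an unlinked temporary file replaces the buffer fd.
int makeScratchFd(size_t length) {
    FILE *f = tmpfile();
    if (!f)
        throw std::runtime_error("tmpfile failed");
    int fd = dup(fileno(f));
    fclose(f);
    if (fd < 0)
        throw std::runtime_error("dup failed");
    if (ftruncate(fd, static_cast<off_t>(length)) != 0) {
        close(fd);
        throw std::runtime_error("ftruncate failed");
    }
    return fd;
}

struct AccessTimings {
    double cachedNs; // read through a pointer mapped once up front
    double remapNs;  // mmap + read + munmap on every access
};

AccessTimings compareAccess(size_t length, int iterations) {
    int fd = makeScratchFd(length);
    void *mem = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) {
        close(fd);
        throw std::runtime_error("mmap failed");
    }
    uint8_t *cached = static_cast<uint8_t *>(mem);
    volatile uint8_t sink = 0; // keeps the reads from being optimised away

    AccessTimings t{};
    t.cachedNs = averageNanoseconds(iterations, [&] { sink = cached[0]; });
    t.remapNs = averageNanoseconds(iterations, [&] {
        void *m = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (m == MAP_FAILED)
            throw std::runtime_error("mmap failed");
        sink = static_cast<uint8_t *>(m)[0];
        munmap(m, length);
    });

    munmap(mem, length);
    close(fd);
    return t;
}
```

Calling compareAccess with the plane length and a few thousand iterations gives one average per strategy. The same split could be applied inside processRequest by stopping the timer right after the pointer is obtained and before the cv::Mat is constructed.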

Statistics: Posted by peytonicmaster — Thu Feb 08, 2024 7:30 am


