GSOC2021SEM
Latest revision as of 18:47, 23 August 2021


Project Overview

  • Project Name: Add logging support to Simulation Execution Manager (SEM)
  • Student: Akshit Patel
  • Mentors: Davide Magrin, Mattia Lecci
  • Project Goals: The project's original aim is to add support for ns-3’s built-in logging to SEM and to provide users with an interactive dashboard for visualizing large ns-3 generated logs. Users will be able to enable specific log components at a specified log level for their ns-3 scripts run with SEM. The dashboard will consist of an interactive table and a graph with different filter options to easily visualize logs, whether they are generated directly by SEM or passed in manually as a log file produced by ns-3 simulations. The following features will be added to SEM by the end of this project:
    • Enable logging through simulations using SEM.
    • Enable logging through simulations using SEM command-line interface.
    • Visualize log files (generated by SEM or ns-3 directly) efficiently using a dashboard.
    • Add tests (using pytest) to validate the newly added logging functions/APIs.
  • Future Work: I will implement the following tasks during the GSoC period (in addition to the goals above) if time permits; otherwise, they will be taken care of after the project is over. These are the originally planned additional tasks:
    • Add support for time filtering of logs in ns-3: ns-3 does not support time filtering of logs as of now. This feature will be useful because the log files generated by ns-3 can be huge, which might directly affect the responsiveness of the SEM dashboard that is to be built.
    • Add additional examples to SEM: As SEM does not provide a large variety of examples, new examples would make the SEM APIs easier for new users to understand. I will add these examples after further discussion with the mentors.
  • Repository: RepoLink
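  • Project Proposal: ProjectProposal (https://docs.google.com/document/d/1kkAAdj9Jo5xP79eDzEnu5zl5y88DmXNGzlZCRlsVWwM/edit?usp=sharing)
  • Design Document: DesignDocument (https://docs.google.com/document/d/1GWQFEF1my4VmCnKayGZGYj6lwtYFQeE5qFI5emJlbOw/edit?usp=sharing)
  • Final Report: Final Report (https://akshitpatel01.github.io/GSoC-2021-Report/)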
  • About Me: I am a final-year undergraduate student pursuing Computer Science and Engineering at the National Institute of Technology Karnataka (NITK), India. I am interested in computer networks and operating systems. I have about 2.5 years of experience in C, C++, Python, and Go.

Milestones and Deliverables

The following are the planned Milestones and Deliverables at the end of each phase:

Phase 1

  • Very brief first phase to get feedback from the mentors.
  • Modify the run_missing_simulations function to add logging support in SEM along with tests and documentation of the functions added/modified.

Phase 2

  • Add a function to read logs (generated by ns-3 directly or generated by SEM) from the provided path/to/logfile and parse them into: (i) Timestamp, (ii) Context, (iii) Function Name, (iv) Function Arguments, (v) Log Message.
  • Add a function that converts this parsed data into a list of dictionaries (or into JSON format, possibly by using Python’s logging module).
  • Add a function that takes the list created in the previous step and stores the logs in a TinyDB instance.
  • Add functions to filter the logs from the TinyDB instance based on different parameters such as: (i) Node Prefix Number, (ii) Function Name, (iii) Log Level, (iv) Time Window.
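
A minimal sketch of the Phase 2 steps above, using only the Python standard library. The log line layout, the regex, and the names `parse_logs` and `filter_logs` are assumptions made for illustration and are not SEM's actual API; a plain list also stands in for the TinyDB instance.

```python
import re

# Assumed line layout (illustrative only, not SEM's actual pattern):
#   +<time>s <context> <Component>:<Function>(<args>): [<LEVEL>] <message>
LOG_LINE = re.compile(
    r"^\+(?P<timestamp>\d+\.\d+)s\s+"
    r"(?P<context>-?\d+)\s+"
    r"(?P<component>\w+):(?P<function>\w+)"
    r"\((?P<arguments>[^)]*)\):?\s*"
    r"(?:\[\s*(?P<level>\w+)\s*\]\s*)?"
    r"(?P<message>.*)$"
)


def parse_logs(lines):
    """Turn raw log lines into a list of dictionaries, skipping unparsable lines."""
    parsed = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        entry = match.groupdict()
        entry["timestamp"] = float(entry["timestamp"])
        entry["context"] = int(entry["context"])
        parsed.append(entry)
    return parsed


def filter_logs(logs, context=None, function=None, level=None, time_window=None):
    """Keep the logs that satisfy every filter that is not None."""
    selected = []
    for log in logs:
        if context is not None and log["context"] != context:
            continue
        if function is not None and log["function"] != function:
            continue
        if level is not None and log["level"] != level:
            continue
        if time_window is not None:
            start, end = time_window
            if not start <= log["timestamp"] <= end:
                continue
        selected.append(log)
    return selected


# Made-up sample lines following the assumed layout.
sample = [
    "+2.000000000s 0 UdpEchoClientApplication:Send(): [INFO ] Sent 1024 bytes",
    "+2.003686400s 1 UdpEchoServerApplication:HandleRead(0x55d0, 0x55e1): [LOGIC] Received packet",
    "not a log line",
]
logs = parse_logs(sample)
late_logs = filter_logs(logs, time_window=(2.001, 3.0))
```

In this sketch every filter is optional, so the same function can serve a dashboard request that sets only some of the filters.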

Phase 3

  • Build components for the interactive dashboard. The initial plan includes using datatables.net for the interactive table and chart.js for the interactive graph.
  • Build a Flask backend to connect to the dashboard. This backend will in turn use the filter functions created during phase 2.

Phase 4

  • Add support to enable logging in SEM simulations via SEM-CLI.
  • Add support for visualization using SEM-CLI.
  • Anything left out from the previous phases.
  • Additional Tasks:
    • Add support for time filtering of logs in ns-3: As discussed in the proposal above, ns-3 does not support time filtering of logs as of now. This feature will be useful for the users as the log files generated by ns-3 can be huge and this might directly affect the responsiveness of the SEM dashboard that is to be built.
    • Add support for custom context in SEM: Provide support for custom contexts created by users (using NS_LOG_APPEND_CONTEXT). Assumption: everything between the timestamp and the function definition is treated as the custom context, and the context number is the first number in the context. Provide filter options based on the context number, but display the entire context to the user.
    • Add additional examples to SEM: As SEM does not provide a large variety of examples, new examples would make the SEM APIs easier for new users to understand. I will add further examples (in addition to the examples discussed above) after further discussion with the mentors.
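
The custom-context assumption stated above can be sketched as follows. The line layout, the regex, and the name `split_context` are hypothetical and only illustrate the rule "context is everything between the timestamp and the function, context number is the first number in it".

```python
import re

# Assumed layout (illustrative): +<time>s <custom context> <Component>:<Function>(...
CUSTOM_CONTEXT_LINE = re.compile(
    r"^\+(?P<timestamp>\d+\.\d+)s\s+"
    r"(?P<context>.*?)\s*"
    r"(?P<function>\w+:\w+)\("
)


def split_context(line):
    """Return the full context plus the first number found inside it."""
    match = CUSTOM_CONTEXT_LINE.match(line)
    if match is None:
        return None
    context = match.group("context")
    number = re.search(r"\d+", context)
    return {
        "timestamp": float(match.group("timestamp")),
        "context": context,  # shown to the user in full
        "context_number": int(number.group()) if number else None,  # used for filtering
        "function": match.group("function"),
    }


# Made-up line with a user-defined context.
custom = split_context("+1.500000000s [node 3, dev 1] WifiPhy:StartTx(0x1): sending")
```

A line with a plain numeric context, such as `+2.000000000s 0 Foo:Bar(): msg`, goes through the same path and yields the context number 0.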

Weekly Reports

Community Bonding Period

  • Prepared a wiki page for the project
  • Carried out experiments to compare the performance of TinyDB (which SEM uses internally) and DataFrame for inserting and querying a large number of logs. These experiments showed that TinyDB can insert around 1 million logs (in JSON format) in around 4.5 seconds, so TinyDB could be a decent choice for this use case.
  • Communicated with the mentors regarding the proposal and refactored the milestones.
  • Started working on a design document to formalize the approach to be followed during the coding phase.

Week 1 (June 7 - June 14)

  • Agreed on the design structure for the functions to be added/modified. This document contains the design in more detail.
  • Added preliminary support to enable logging for simulations run using SEM. This allows the users to enable logging for multiple parameter combinations. SEM will also store the logging results in addition to the non-logging results in the database.

Week 2 (June 14 - June 21)

  • Completed adding support to enable logs in SEM.
  • Added python docstrings and tests in the pytest framework for the new/modified functions.
  • Created a PR for adding logging support to SEM. Also had a thorough discussion on the PR (can be viewed here) and worked on the changes requested by the mentors.

Week 3 (June 21 - June 28)

  • Resolved errors encountered after testing the new logging module added.
  • Created a simple example to demonstrate the workflow of logging in SEM.
  • Finished working on the suggestions/comments discussed on the PR.
  • Updated PR: link
  • Updated the design document (for phase 2) in accordance with the implemented phase 1 (Refer: link).

Week 4 (June 28 - July 5)

  • Added a user-oriented overview of the project to the design document (link).
  • Solved issues related to querying results based on log components.
  • Merged phase 1 code into SEM (PR).
  • Started working on phase 2 and created a new PR for phase 2.
  • The complete phase 1 code can be used on the gsoc2021 branch.
  • The phase 2 code can be viewed on the gsoc-phase2 branch.

Week 5 (July 5 - July 12)

  • Updated phase 3 design in the design document.
  • Finished working on suggestions/comments discussed on the phase2-PR (https://github.com/signetlabdei/sem/pull/51).
  • Added tests and python docstrings for the phase 2 code.

Week 6 (July 12 - July 19)

  • Measured the overall time the phase 2 code takes to parse and filter log files of sizes ranging from 100M to 1000M (in intervals of 100M). Also profiled the code to find the time taken by individual functions. These results (https://github.com/signetlabdei/sem/pull/51#issuecomment-880504533) are yet to be discussed.
  • Added a few more tests for the regex that parses the logs from the provided log file. Also updated the pytest structure to remove code duplication.
  • Started working on the Flask backend and linking it with a sample DataTables table (https://datatables.net/, a jQuery plugin used to make interactive tables) in the frontend.
  • Phase 2 PR: https://github.com/signetlabdei/sem/pull/51
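
Per-function timings of the kind described above can be collected with Python's built-in `cProfile` and `pstats` modules. The sketch below is only an illustration: `parse_line` and `parse_all` are hypothetical stand-ins for the phase 2 parsing code, and the generated lines are made up.

```python
import cProfile
import io
import pstats
import re


def parse_line(line):
    # Hypothetical stand-in for the real log-parsing function.
    return re.match(r"^\+(\d+\.\d+)s (\d+) (.*)$", line)


def parse_all(lines):
    return [parse_line(line) for line in lines]


def profile_report(func, *args, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


# Synthetic input; a real experiment would read an ns-3 log file instead.
lines = ["+%.9fs %d msg" % (i * 0.001, i % 4) for i in range(10000)]
report = profile_report(parse_all, lines)
print(report)
```

Sorting by cumulative time puts the functions that dominate the run at the top of the report, which is how a hot spot such as an expensive copy would stand out.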

Week 7 (July 19 - July 26)

  • Discussed the code-profiling results with the mentors and found that a significant amount of time was spent in a deepcopy used when inserting logs. As this function is not supposed to be called by the user, we decided to remove the deepcopy, which significantly reduced the overall time for inserting logs.
  • Removed a case where the user's input arguments were modified directly (as Python lists are mutable, this can cause unexpected behavior for the users).
  • Split the regex for parsing logs using named groups. This breaks the large regex down into smaller chunks and makes it easier to maintain.
  • Worked on creating dropdowns for the different filters, followed by jQuery functions that pass the selected parameters to the Flask backend.
  • Phase 2 PR: https://github.com/signetlabdei/sem/pull/51
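
The mutable-argument issue mentioned above can be illustrated with a small sketch. The function names and the "Simulator" component are hypothetical and not taken from the SEM code; the point is only that appending to a caller's list changes it for the caller, while working on a copy does not.

```python
def add_default_in_place(components):
    # Problematic: appends to the very list object the caller passed in.
    components.append("Simulator")
    return components


def add_default_safely(components):
    # Safer: a shallow copy is enough for a flat list of strings,
    # so no deepcopy is needed to protect the caller's data.
    result = list(components)
    result.append("Simulator")
    return result


user_components = ["UdpEchoClientApplication"]
safe_result = add_default_safely(user_components)
# user_components is unchanged at this point.

mutated_input = ["UdpEchoClientApplication"]
add_default_in_place(mutated_input)
# mutated_input now also contains "Simulator", which the caller did not ask for.
```

For user-facing functions the copy is a cheap safeguard. For internal helpers that own their input, skipping the copy avoids the overhead, which matches the trade-off described in the deepcopy item above.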

Week 8 (July 26 - August 2)

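  • Merged phase 2 code into SEM (PR: https://github.com/signetlabdei/sem/pull/51).
  • The phase 1 and phase 2 code can be found in the SEM repository on the gsoc2021 branch (https://github.com/signetlabdei/sem/tree/gsoc2021).
  • Created a dashboard with an interactive table and filters for context, calling function, log severity class, log component, and time. The code can be found here: https://github.com/akshitpatel01/sem/tree/gsoc-phase2/dashboard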

Week 9 (Aug 2 - Aug 9)

  • Had a discussion with the mentors on the time complexity of jittering large log files. The current implementation is not feasible for a large file and hence we came up with two possible solutions to tackle this issue. These approaches are yet to be tested.
  • Added an interactive graph for viewing the logs in addition to the table.
  • Worked on suggestions/comments of the mentors to improve the dashboard.
  • The latest dashboard code can be found here: https://github.com/akshitpatel01/sem/tree/gsoc-phase2/dashboard

Week 10 (Aug 9 - Aug 16)

  • After a discussion with the mentors, tuned the jitter logs function, which significantly improved its performance.
  • Added a complete overview of the current dashboard features here: https://github.com/akshitpatel01/sem/blob/gsoc2021/sem/dashboard/README.md
  • Worked on suggestions/comments of the mentors to improve the dashboard.
  • Created a PR for the dashboard: https://github.com/signetlabdei/sem/pull/54
  • The latest code can be found here: https://github.com/akshitpatel01/sem/tree/gsoc2021