Finding duplicated entries

Since the Esteemed Academia wants me to learn Haskell, and Haskell still seems just weird to me, I thought I'd practice some basic Haskell skills by writing a simple script that I currently need.



This is what the script is supposed to do... I have a file that looks like this:



    oo R
    j
    j ai    I'm a comment
    kd oo
    stack j
    a

(On the third line, the gap between `ai` and `I'm a comment` is a tab character.)


I need to extract the duplicated elements from this file, together with the lines they appear on. The catch is that a tab character starts a comment, so `I'm a comment` should be ignored. In particular, the `a` inside `I'm a comment` should not match the `a` that is the last element of the sample.
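The comment rule can be sketched for a single line as follows. This is only an illustration of the rule as described; the names `stripComment` and `lineElements` are hypothetical and do not appear in the script.

```haskell
-- Hypothetical helper names, used only to illustrate the comment rule.

-- | Drop everything from the first tab onwards (the comment part).
stripComment :: String -> String
stripComment = takeWhile (/= '\t')

-- | Elements of one line: the whitespace-separated words before the comment.
lineElements :: String -> [String]
lineElements = words . stripComment
```

For example, `lineElements "j ai\tI'm a comment"` evaluates to `["j","ai"]`, so the `a` inside the comment never becomes an element.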



Given this sample file, my script should report two duplicated elements: `oo` on lines 1 and 4, and `j` on lines 2, 3 and 5. I believe the script I wrote does this correctly.
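To make the expected result concrete, here is a naive reference sketch that can be checked against the sample. It is not the script under review: `sample` and `duplicates` are hypothetical names, the tab is written explicitly as `\t`, and an element counts as duplicated only if it occurs on more than one line (a word repeated within a single line would be treated differently from a version that counts every occurrence).

```haskell
import Data.List (nub)

-- The sample file, with the tab written explicitly as \t.
sample :: String
sample = unlines
  [ "oo R"
  , "j"
  , "j ai\tI'm a comment"
  , "kd oo"
  , "stack j"
  , "a"
  ]

-- | Naive reference: each element paired with the lines it occurs on,
-- keeping only elements seen on more than one line. Quadratic, but short.
duplicates :: String -> [(String, [Int])]
duplicates input =
  [ (w, ls)
  | w <- nub (concatMap snd numbered)
  , let ls = [n | (n, ws) <- numbered, w `elem` ws]
  , length ls > 1
  ]
  where
    -- Lines numbered from 1, comments cut at the first tab, split into words.
    numbered = zip [1 ..] (map (words . takeWhile (/= '\t')) (lines input))
```

With these definitions, `duplicates sample` evaluates to `[("oo",[1,4]),("j",[2,3,5])]`, which matches the report described above.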



Here it is:



    import qualified Data.Map as Map
    import Data.List
    import Data.Function

    main = do
        input <- fmap preprocess getContents
        printDupes $ findDupes input

    -- Strip comments (everything after a tab), split each line into words,
    -- and pair every word with its 1-based line number.
    preprocess input = let
            untabbedLines = map (takeWhile (/= '\t')) $ lines input
            wordsedLines = map words untabbedLines
            numberedLines = zip wordsedLines [1..]
            numberedWords = concat $ map indexDown numberedLines where
                indexDown (words', index) = map (flip (,) $ index) words'
        in numberedWords

    -- Collect the line numbers of every word; keep words seen more than once.
    findDupes entries = let
            occurences = Map.fromListWith (flip (++)) $ map (fmap (:[])) entries
        in Map.filter ((>1).length) occurences

    -- Print the duplicates, ordered by the lines they appear on.
    printDupes :: Map.Map String [Int] -> IO ()
    printDupes dupes = let
            showPositions positions = intercalate ", " $ map show positions
            printDupe (dupe, positions) = putStrLn $ "Duplicated element " ++ dupe ++ " found on positions: " ++ showPositions positions
        in let
            sortedDupes = sortBy (compare `on` snd) $ Map.assocs dupes
        in mapM_ printDupe sortedDupes
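The one non-obvious step in `findDupes` is `Map.fromListWith (flip (++))`. `Map.fromListWith f` combines values as `f new old`, so a plain `(++)` would put later line numbers first. The small sketch below shows the difference; `entries`, `descending` and `ascending` are hypothetical names, and the input mirrors what the sample file would produce for `oo` and `j`.

```haskell
import qualified Data.Map as Map

-- Each entry starts out as a one-element list of line numbers.
entries :: [(String, [Int])]
entries = [("oo", [1]), ("j", [2]), ("j", [3]), ("oo", [4]), ("j", [5])]

-- fromListWith calls the combining function as (f new old), so plain (++)
-- prepends later line numbers: "j" maps to [5,3,2].
descending :: Map.Map String [Int]
descending = Map.fromListWith (++) entries

-- flip (++) appends them instead, keeping input order: "j" maps to [2,3,5].
ascending :: Map.Map String [Int]
ascending = Map.fromListWith (flip (++)) entries
```

Appending with `++` this way is quadratic in the number of occurrences of a single word, which is unlikely to matter for small files.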


The task is simple, but I guess it's OK for learning how to code in Haskell...



So could you kindly review this self-imposed exercise? What could I have done better? Simpler? Shorter? I'm sure a lot, but what precisely? Also, how can I improve readability?










Tags: haskell

asked 11 mins ago by gaazkam
